
The Blis BLAS-like linear algebra library, as a self-contained C-extension.


Cython BLIS: Fast BLAS-like operations from Python and Cython, without the tears

This repository provides the Blis linear algebra routines as a self-contained Python C-extension.

Currently, we only support single-threaded execution, as this is actually best for our workloads (ML inference).


Installation

You can install the package via pip:

pip install blis

Wheels should be available, so installation should be fast. If you want to install from source and you're on Windows, you'll need to install LLVM.

Building BLIS for alternative architectures

The provided wheels should work on x86_64 architectures. Unfortunately, we do not currently know of a way to provide different wheels for alternative architectures, and we cannot provide a single binary that works everywhere. So if the wheel doesn't work for your CPU, you'll need to install from the source distribution and tell Blis your CPU architecture using the BLIS_ARCH environment variable.
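
To check in advance whether the prebuilt wheel is likely to apply, you can look at the machine type Python reports. This is a minimal sketch that relies only on the statement above that the wheels target x86_64. The helper name `needs_source_build` and the alias set are illustrative assumptions, not part of blis.

```python
import platform

# Names under which platform.machine() commonly reports 64-bit x86.
# This alias set is an assumption for illustration, not part of blis.
X86_64_ALIASES = {"x86_64", "amd64"}


def needs_source_build(machine=None):
    """Return True if the machine type is not a known x86_64 alias."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() not in X86_64_ALIASES


if __name__ == "__main__":
    if needs_source_build():
        print("Not x86_64: install from source and set BLIS_ARCH")
    else:
        print("x86_64: the prebuilt wheel should work")
```

Passing the machine string explicitly, for example `needs_source_build("aarch64")`, lets you test the decision without being on that hardware.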

a) Installing with generic arch support

BLIS_ARCH="generic" pip install spacy --no-binary blis

b) Building specific support

To compile Blis, cython-blis bundles build instructions for specific architectures. These are generated by running the Blis build system and logging the commands it executes. We do not yet have logs for every architecture, as there are some architectures we have not had access to.

See here for the list of architectures. For example, here's how to build support for the ARM architecture cortexa57:

git clone https://github.com/explosion/cython-blis && cd cython-blis
git pull && git submodule init && git submodule update && git submodule status
python3 -m venv env3.6
source env3.6/bin/activate
pip install -r requirements.txt
./bin/generate-make-jsonl linux cortexa57
BLIS_ARCH="cortexa57" python setup.py build_ext --inplace
BLIS_ARCH="cortexa57" python setup.py bdist_wheel

Fingers crossed, this will build you a wheel that supports your platform. You could then submit a PR with the blis/_src/make/linux-cortexa57.jsonl and blis/_src/include/linux-cortexa57/blis.h files so that you can run:

BLIS_ARCH=cortexa57 pip install blis --no-binary=blis
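
The two files follow a naming pattern based on the OS and architecture names. As a small sketch of that pattern, the hypothetical helper below (`build_file_paths` is not part of cython-blis) derives both paths so you can check that they exist before opening a PR:

```python
from pathlib import PurePosixPath


def build_file_paths(os_name, arch, root="blis/_src"):
    """Return (jsonl_path, header_path) for an OS/architecture pair."""
    target = f"{os_name}-{arch}"
    base = PurePosixPath(root)
    jsonl_path = base / "make" / f"{target}.jsonl"
    header_path = base / "include" / target / "blis.h"
    return jsonl_path, header_path


jsonl_path, header_path = build_file_paths("linux", "cortexa57")
print(jsonl_path)
print(header_path)
```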

Running the benchmark

After installation, run a small matrix multiplication benchmark:

$ export OMP_NUM_THREADS=1 # Tell Numpy to only use one thread.
$ python -m blis.benchmark
Setting up data nO=384 nI=384 batch_size=2000. Running 1000 iterations
Blis...
Total: 11032014.6484
7.35 seconds
Numpy (Openblas)...
Total: 11032016.6016
16.81 seconds
Blis einsum ab,cb->ca
8.10 seconds
Numpy einsum ab,cb->ca
Total: 5510596.19141
83.18 seconds

The low numpy.einsum performance is expected, but the low numpy.dot performance is surprising. Linking numpy against MKL gives better performance:

Numpy (mkl_rt) gemm...
Total: 11032011.71875
5.21 seconds

These figures refer to performance on a Dell XPS 13 i7-7500U. Running the same benchmark on a 2015 MacBook Air gives:

Blis...
Total: 11032014.6484
8.89 seconds
Numpy (Accelerate)...
Total: 11032012.6953
6.68 seconds

Clearly the Dell's numpy+OpenBLAS performance is the outlier, so it's likely something has gone wrong in the compilation and architecture detection.
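
To compare these timings across machines, it can help to convert them into a rough throughput figure. The sketch below assumes that each iteration performs one (batch_size x nI) by (nI x nO) matrix product costing 2 * batch_size * nI * nO floating-point operations. That is an assumption about what the benchmark does, not something taken from its source, and `gemm_gflops` is a hypothetical helper.

```python
def gemm_gflops(n_out, n_in, batch_size, iterations, seconds):
    """Estimate GFLOP/s, counting one multiply and one add per inner-product term."""
    flops = 2 * n_out * n_in * batch_size * iterations
    return flops / seconds / 1e9


# Timings (in seconds) copied from the benchmark output above.
timings = {
    "Blis (Dell)": 7.35,
    "Numpy OpenBLAS (Dell)": 16.81,
    "Numpy MKL (Dell)": 5.21,
    "Blis (MacBook Air)": 8.89,
    "Numpy Accelerate (MacBook Air)": 6.68,
}
for name, seconds in timings.items():
    rate = gemm_gflops(384, 384, 2000, 1000, seconds)
    print(f"{name}: {rate:.1f} GFLOP/s")
```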

Usage

Two APIs are provided: a high-level Python API, and direct Cython access. The best part of the Python API is the einsum function, which works like numpy's, but with some restrictions that allow a direct mapping to Blis routines. Example usage:

from blis.py import einsum
from numpy import ndarray, zeros

dim_a = 500
dim_b = 128
dim_c = 300
arr1 = ndarray((dim_a, dim_b))
arr2 = ndarray((dim_b, dim_c))
out = zeros((dim_a, dim_c))

einsum('ab,bc->ac', arr1, arr2, out=out)
# Change dimension order of output
out = einsum('ab,bc->ca', arr1, arr2)
assert out.shape == (dim_c, dim_a)
# Matrix vector product, with transposed output
arr2 = ndarray((dim_b,))
out = einsum('ab,b->ba', arr1, arr2)
assert out.shape == (dim_b, dim_a)

The Einstein summation format is really awesome, so it's always been disappointing that it's so much slower than equivalent calls to tensordot in numpy. The blis.einsum function gives up the numpy version's generality, so that calls can be easily mapped to Blis:

  • Only two input tensors
  • Maximum two dimensions
  • Dimensions must be labelled a, b and c
  • The first argument's dimensions must be 'a' (for 1d inputs) or 'ab' (for 2d inputs).

With these restrictions, there are only 15 valid combinations, which correspond to all the things you would otherwise do with the gemm, gemv, ger and axpy functions. You can therefore forget about all the other functions and just use einsum. Here are the valid einsum strings, the calls they correspond to, and the numpy equivalents:

Equation     Maps to                                   Numpy
'a,a->a'     axpy(A, B)                                A+B
'a,b->ab'    ger(A, B)                                 outer(A, B)
'a,b->ba'    ger(B, A)                                 outer(B, A)
'ab,a->ab'   batch_axpy(A, B)                          A*B
'ab,a->ba'   batch_axpy(A, B, trans1=True)             (A*B).T
'ab,b->a'    gemv(A, B)                                dot(A, B)
'ab,a->b'    gemv(A, B, trans1=True)                   dot(A.T, B)
'ab,ac->cb'  gemm(B, A, trans1=True, trans2=True)      dot(B.T, A)
'ab,ac->bc'  gemm(A, B, trans1=True, trans2=False)     dot(A.T, B)
'ab,bc->ac'  gemm(A, B, trans1=False, trans2=False)    dot(A, B)
'ab,bc->ca'  gemm(B, A, trans1=False, trans2=True)     dot(B.T, A.T)
'ab,ca->bc'  gemm(A, B, trans1=True, trans2=True)      dot(A.T, B.T)
'ab,ca->cb'  gemm(B, A, trans1=False, trans2=False)    dot(B, A)
'ab,cb->ac'  gemm(A, B, trans1=False, trans2=True)     dot(A, B.T)
'ab,cb->ca'  gemm(B, A, trans1=False, trans2=True)     dot(B, A.T)
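
To make the equation strings concrete, here is a naive pure-Python sketch of standard Einstein-summation semantics for two operands, as used by numpy.einsum: labels that do not appear in the output are summed over. It is only a reference for reading the matrix-matrix and matrix-vector rows of the table. `reference_einsum` is a hypothetical helper, it is not how blis computes anything, and it is far too slow for real use.

```python
from itertools import product


def shape_of(x):
    """Shape of a 1d or 2d nested list."""
    return (len(x), len(x[0])) if isinstance(x[0], list) else (len(x),)


def get(x, indices):
    """Index into a nested list with a sequence of indices."""
    for i in indices:
        x = x[i]
    return x


def reference_einsum(equation, A, B):
    """Naive two-operand einsum over nested lists (1d or 2d inputs and output)."""
    inputs, out_labels = equation.split("->")
    labels_a, labels_b = inputs.split(",")
    # Record the size of every labelled dimension and check they agree.
    sizes = {}
    for labels, arr in ((labels_a, A), (labels_b, B)):
        for label, size in zip(labels, shape_of(arr)):
            assert sizes.setdefault(label, size) == size, "dimension mismatch"
    all_labels = sorted(sizes)
    # Allocate a zeroed output of the requested shape.
    if len(out_labels) == 1:
        out = [0.0] * sizes[out_labels]
    else:
        out = [[0.0] * sizes[out_labels[1]] for _ in range(sizes[out_labels[0]])]
    # Visit every index combination; labels absent from the output get summed.
    for values in product(*(range(sizes[label]) for label in all_labels)):
        idx = dict(zip(all_labels, values))
        term = get(A, [idx[l] for l in labels_a]) * get(B, [idx[l] for l in labels_b])
        if len(out_labels) == 1:
            out[idx[out_labels]] += term
        else:
            out[idx[out_labels[0]]][idx[out_labels[1]]] += term
    return out


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # shape (2, 3), labelled 'ab'
B = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # shape (3, 2), labelled 'bc'
print(reference_einsum("ab,bc->ac", A, B))  # matrix product
print(reference_einsum("ab,bc->ca", A, B))  # same product, transposed output
```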

We also provide fused-type, nogil Cython bindings to the underlying Blis linear algebra library. Fused types are a simple template mechanism, allowing just a touch of compile-time generic programming:

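from libc.stdlib cimport calloc, free
cimport blis.cy

# Example sizes. Row-major float buffers: A is (nN, nI), B is (nI, nO), C is (nN, nO).
cdef int nN = 2000, nI = 384, nO = 384
cdef float* A = <float*>calloc(nN * nI, sizeof(float))
cdef float* B = <float*>calloc(nI * nO, sizeof(float))
cdef float* C = <float*>calloc(nN * nO, sizeof(float))
# C = 1.0 * (A @ B) + 1.0 * C. Each matrix is followed by its row and column stride.
blis.cy.gemm(blis.cy.NO_TRANSPOSE, blis.cy.NO_TRANSPOSE,
             nN, nO, nI,
             1.0, A, nI, 1, B, nO, 1,
             1.0, C, nO, 1)
free(A); free(B); free(C)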
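
In the call above, each matrix pointer is followed by a row stride and a column stride, so element (i, j) lives at offset i * row_stride + j * column_stride in the flat buffer. The pure-Python sketch below illustrates that addressing scheme on plain lists. `strided_gemm` is a hypothetical helper written for this explanation, not part of blis, and it leaves out the transpose arguments.

```python
def strided_gemm(m, n, k, alpha, a, rsa, csa, b, rsb, csb, beta, c, rsc, csc):
    """Compute C = alpha * A @ B + beta * C in place on flat buffers.

    A is m x k, B is k x n and C is m x n; rs*/cs* are row/column strides.
    """
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i * rsa + p * csa] * b[p * rsb + j * csb]
            c[i * rsc + j * csc] = alpha * acc + beta * c[i * rsc + j * csc]


# Row-major 2x3 times row-major 3x2: the row stride is the number of columns.
A = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
B = [1.0, 0.0,
     0.0, 1.0,
     2.0, 2.0]
C = [0.0] * 4
strided_gemm(2, 2, 3, 1.0, A, 3, 1, B, 2, 1, 1.0, C, 2, 1)
print(C)
```

Swapping a matrix's two strides makes the same buffer read as its transpose, so column-major data can be passed without copying.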

Bindings have been added as we've needed them. Please submit pull requests if the library is missing some functions you require.

Development

To build the source package, you should run the following command:

./bin/copy-source-files.sh

This populates the blis/_src folder for the various architectures, using the flame-blis submodule.

Updating the build files

In order to compile the Blis sources, we use jsonl files that provide the explicit compiler flags. We build these jsonl files by running Blis's build system, and then converting the log. This avoids us having to replicate the build system within Python: we just use the jsonl to make a bunch of subprocess calls. To support a new OS/architecture combination, we have to provide the jsonl file and the header.
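
The replay idea can be sketched as follows: read one JSON object per line, then pass each recorded command to subprocess. This is only an illustration. It assumes each record stores its command as a list of strings under an "argv" key, which is a made-up schema; the real jsonl files in cython-blis may use different keys. `load_commands` and `replay` are hypothetical names.

```python
import json
import subprocess


def load_commands(path):
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    with open(path, encoding="utf8") as file_:
        return [json.loads(line) for line in file_ if line.strip()]


def replay(records, dry_run=True):
    """Run each recorded command in order, stopping at the first failure.

    ASSUMPTION: each record has an "argv" list of strings. The actual
    cython-blis schema may differ.
    """
    for record in records:
        argv = record["argv"]
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(argv, check=True)


# Placeholder record for illustration only, not taken from a real build log.
demo = [{"argv": ["cc", "-c", "example.c", "-o", "example.o"]}]
replay(demo)  # dry run: only prints the command
```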

Linux

The Linux build files need to be produced from within the manylinux1 docker container, so that they will be compatible with the wheel building process.

First, install docker. Then do the following to start the container:

sudo docker run -it quay.io/pypa/manylinux1_x86_64:latest

Once within the container, the following commands should check out the repo and build the jsonl files for the generic arch:

mkdir /usr/local/repos
cd /usr/local/repos
git clone https://github.com/explosion/cython-blis && cd cython-blis
git pull && git submodule init && git submodule update && git submodule status
/opt/python/cp36-cp36m/bin/python -m venv env3.6
source env3.6/bin/activate
pip install -r requirements.txt
./bin/generate-make-jsonl linux generic --export
BLIS_ARCH=generic python setup.py build_ext --inplace
# N.B.: don't copy to /tmp, docker cp doesn't work from there.
cp blis/_src/include/linux-generic/blis.h /linux-generic-blis.h
cp blis/_src/make/linux-generic.jsonl /

Then from a new terminal, retrieve the two files we need out of the container:

sudo docker ps -l # Get the container ID
# When I'm in Vagrant, I need to go via cat -- but then I end up with dummy
# lines at the top and bottom. Sigh. If you don't have that problem and
# sudo docker cp just works, just copy the file.
sudo docker cp aa9d42588791:/linux-generic-blis.h - | cat > linux-generic-blis.h
sudo docker cp aa9d42588791:/linux-generic.jsonl - | cat > linux-generic.jsonl
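
When the destination is "-", docker cp writes a tar archive to stdout rather than the bare file, which would explain the extra content at the top and bottom. If you end up with such a stream, the standard library can unpack it. This is a sketch under that assumption. `extract_single_file` and `make_tar` are hypothetical helpers, and the demo archive is built in memory with placeholder content to stand in for docker's output.

```python
import io
import tarfile


def extract_single_file(tar_bytes, member_name):
    """Return the contents of one regular file from an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        extracted = archive.extractfile(member_name)
        if extracted is None:
            raise ValueError(f"{member_name} is not a regular file")
        return extracted.read()


def make_tar(name, payload):
    """Build a small tar archive in memory (stand-in for the docker stream)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Placeholder payload, not real build data.
demo = make_tar("linux-generic.jsonl", b'{"example": true}\n')
print(extract_single_file(demo, "linux-generic.jsonl"))
```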

