This package speeds up a sparse matrix multiplication followed by selection of the top-n multiplication results

Project description

sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

In practice, comparing very large feature vectors and picking the best matches often comes down to a sparse matrix multiplication followed by selection of the top-n multiplication results. This package implements a customized Cython function for this purpose. Compared with doing the same thing using SciPy and NumPy functions, our Cython approach is about 40% faster and uses less memory.
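To make the operation concrete, here is a minimal sketch of the plain SciPy/NumPy baseline described above: multiply first, then filter and rank each row of the product. The function name naive_dot_topn and its parameters are illustrative assumptions and not part of this package; the package's Cython routine may differ in details such as tie handling.

```python
import numpy as np
from scipy.sparse import csr_matrix


def naive_dot_topn(a, b, ntop, lower_bound=0.0):
    """Compute a.dot(b), keeping per row at most `ntop` values above `lower_bound`.

    Hypothetical baseline for illustration only; it materializes the full
    product before selecting, which is the cost a fused routine avoids.
    """
    product = csr_matrix(a).dot(csr_matrix(b)).tocsr()
    data, indices, indptr = [], [], [0]
    for i in range(product.shape[0]):
        start, end = product.indptr[i], product.indptr[i + 1]
        row_vals = product.data[start:end]
        row_cols = product.indices[start:end]
        # Drop candidates that do not exceed the threshold.
        keep = row_vals > lower_bound
        row_vals, row_cols = row_vals[keep], row_cols[keep]
        # Rank the survivors in descending order and keep the first ntop.
        order = np.argsort(-row_vals)[:ntop]
        data.extend(row_vals[order].tolist())
        indices.extend(row_cols[order].tolist())
        indptr.append(len(data))
    return csr_matrix((data, indices, indptr), shape=product.shape)


# Small hand-built inputs so the result is easy to check by eye.
a_small = csr_matrix(np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]]))
b_small = csr_matrix(np.array([[1.0, 0.0, 0.0, 2.0],
                               [0.0, 1.0, 0.0, 0.0],
                               [0.0, 0.0, 3.0, 1.0]]))
top2 = naive_dot_topn(a_small, b_small, ntop=2, lower_bound=0.5)
print(top2.toarray())
```

The full product of the first row is [1, 0, 6, 4]; with ntop=2 only the two largest entries survive. Because the whole product is built before anything is discarded, this baseline uses more memory than an approach that selects while multiplying.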

This package is made by the ING Wholesale Banking Advanced Analytics team. This blog explains how we implement it.

Example

from scipy.sparse import rand
from sparse_dot_topn import awesome_cossim_topn

N = 10  # number of top results to keep per row
a = rand(100, 1000000, density=0.005, format='csr')
b = rand(1000000, 200, density=0.005, format='csr')

# Use standard implementation, with 0.01 as the lower bound on kept values
c = awesome_cossim_topn(a, b, N, 0.01)

# Use parallel implementation with 4 threads
d = awesome_cossim_topn(a, b, N, 0.01, use_threads=True, n_jobs=4)

The code in example/comparison.py compares our method with calling the SciPy and NumPy functions directly.
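Once a result such as c or d is available, the matches can be read row by row. The sketch below assumes the result is a SciPy CSR matrix; the helper name iter_matches is hypothetical, and a small hand-built matrix stands in for a real result so the snippet runs without this package installed.

```python
from scipy.sparse import csr_matrix


def iter_matches(result):
    """Yield (row, column, value) for every stored entry of a CSR matrix."""
    result = csr_matrix(result)
    for row in range(result.shape[0]):
        start, end = result.indptr[row], result.indptr[row + 1]
        cols = result.indices[start:end]
        vals = result.data[start:end]
        for col, value in zip(cols, vals):
            yield row, int(col), float(value)


# Stand-in for a top-n result: row 0 matched column 1, row 1 matched columns 0 and 2.
demo = csr_matrix([[0.0, 0.9, 0.0], [0.4, 0.0, 0.7]])
matches = list(iter_matches(demo))
for row, col, value in matches:
    print(row, col, value)
```

Reading the indptr, indices and data arrays directly avoids converting the result to a dense array, which matters when the result has many rows and columns.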

Dependency and Install

Install numpy and cython before installing this package. Then,

pip install sparse_dot_topn

Uninstall

pip uninstall sparse_dot_topn

Download files

Source Distribution

sparse_dot_topn-0.2.8.tar.gz (106.3 kB view details)

Uploaded Source

Built Distributions

sparse_dot_topn-0.2.8-cp37-cp37m-macosx_10_14_x86_64.whl (60.9 kB view details)

Uploaded CPython 3.7m, macOS 10.14+ x86-64

sparse_dot_topn-0.2.8-cp27-cp27m-macosx_10_14_intel.whl (69.0 kB view details)

Uploaded CPython 2.7m, macOS 10.14+ Intel (x86-64, i386)

File details

Details for the file sparse_dot_topn-0.2.8.tar.gz.

File metadata

  • Download URL: sparse_dot_topn-0.2.8.tar.gz
  • Size: 106.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for sparse_dot_topn-0.2.8.tar.gz

  • SHA256: 05d1d3384cd7c0b2c45aa988b2cbe28f11697b6fbe6f36e79788671064d4bf19
  • MD5: b573aceb1aaf554aaf65ba8c8dc92c02
  • BLAKE2b-256: 6b6e322880f13aeaae70754cd9d5c99c4bb9a61a3de5a9187d34079fbe0548a2
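
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names and the local file path are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for sparse_dot_topn-0.2.8.tar.gz
EXPECTED_SHA256 = "05d1d3384cd7c0b2c45aa988b2cbe28f11697b6fbe6f36e79788671064d4bf19"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


# Example call, assuming the archive was saved to the current directory:
# matches_published_hash("sparse_dot_topn-0.2.8.tar.gz")
```

Reading in chunks keeps memory use constant regardless of file size.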

File details

Details for the file sparse_dot_topn-0.2.8-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.8-cp37-cp37m-macosx_10_14_x86_64.whl
  • Size: 60.9 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for sparse_dot_topn-0.2.8-cp37-cp37m-macosx_10_14_x86_64.whl

  • SHA256: 5dea8617aff2ed58154a97df37932a47ef85f5566d9e470ea70ad6993f5b9538
  • MD5: d3cd4d7a6e6b23cb7874ee7f392c3000
  • BLAKE2b-256: 692e20f605d0d54070036dbe39e15c40df2b54a9f4ec2a89dc309c32805290e6

File details

Details for the file sparse_dot_topn-0.2.8-cp27-cp27m-macosx_10_14_intel.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.8-cp27-cp27m-macosx_10_14_intel.whl
  • Size: 69.0 kB
  • Tags: CPython 2.7m, macOS 10.14+ Intel (x86-64, i386)
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for sparse_dot_topn-0.2.8-cp27-cp27m-macosx_10_14_intel.whl

  • SHA256: a7a3fdd183540369a17a3db8da4fa49201e2cecf44e116684ccfd6946143c6d1
  • MD5: 57a6cc60d17c208f36ef361b1046f1ac
  • BLAKE2b-256: f42be7d769665e6b2110aa22647193bfff27424c28adab88866d863c5d9457c7
