
This package speeds up sparse matrix multiplication followed by selection of the top-n multiplication results

Project description

sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

Comparing very large feature vectors and picking the best matches often comes down, in practice, to a sparse matrix multiplication followed by selecting the top-n multiplication results. This package implements a customized Cython function for this purpose. Compared with doing the same using SciPy and NumPy functions, our Cython approach improves the speed by about 40% and reduces memory consumption.
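For reference, the SciPy and NumPy baseline described above can be sketched as follows. This is only an illustration, not the package's implementation: naive_dot_topn is a hypothetical helper, and keeping values strictly above lower_bound is an assumption about the threshold semantics. The sketch builds the full product first, which is the intermediate that a combined multiply-and-select routine avoids holding in memory.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random


def naive_dot_topn(a, b, ntop, lower_bound=0.0):
    """Hypothetical baseline: full sparse product, then top-n per row."""
    product = csr_matrix(a.dot(b))  # materializes every nonzero of a * b
    indptr, indices, data = [0], [], []
    for row in range(product.shape[0]):
        start, end = product.indptr[row], product.indptr[row + 1]
        cols = product.indices[start:end]
        vals = product.data[start:end]
        # Assumption: only values strictly above lower_bound are kept.
        keep = vals > lower_bound
        cols, vals = cols[keep], vals[keep]
        # Order the surviving values from largest to smallest, keep ntop.
        order = np.argsort(-vals, kind="stable")[:ntop]
        indices.extend(cols[order].tolist())
        data.extend(vals[order].tolist())
        indptr.append(len(indices))
    return csr_matrix(
        (np.array(data, dtype=float),
         np.array(indices, dtype=np.int32),
         np.array(indptr, dtype=np.int32)),
        shape=product.shape,
    )


# Small random inputs so the sketch runs quickly.
a = sparse_random(20, 500, density=0.05, format="csr", random_state=0)
b = sparse_random(500, 30, density=0.05, format="csr", random_state=1)
top = naive_dot_topn(a, b, ntop=3, lower_bound=0.01)
```

Each row of top stores at most ntop entries, ordered by score rather than by column index.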

This package is made by the ING Wholesale Banking Advanced Analytics team. This blog post explains how we implemented it.

Example

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse import rand
from sparse_dot_topn import awesome_cossim_topn

N = 10
a = rand(100, 1000000, density=0.005, format='csr')
b = rand(1000000, 200, density=0.005, format='csr')

# Use standard implementation

c = awesome_cossim_topn(a, b, N, 0.01)

# Use parallel implementation with 4 threads

d = awesome_cossim_topn(a, b, N, 0.01, use_threads=True, n_jobs=4)
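The matches can then be read out of the result row by row. The sketch below assumes the returned value is a SciPy CSR matrix whose stored entries are the selected scores. iter_matches is a hypothetical helper, not part of the package, and the small hand-built matrix stands in for c or d so the snippet runs without the package installed.

```python
import numpy as np
from scipy.sparse import csr_matrix


def iter_matches(result):
    """Yield (row, column, score) for each stored entry, best score first per row."""
    result = csr_matrix(result)
    for row in range(result.shape[0]):
        start, end = result.indptr[row], result.indptr[row + 1]
        cols = result.indices[start:end]
        scores = result.data[start:end]
        # Sort explicitly instead of relying on the stored order.
        for k in np.argsort(-scores, kind="stable"):
            yield row, int(cols[k]), float(scores[k])


# Hand-built stand-in for the matrix returned by awesome_cossim_topn (assumption).
demo = csr_matrix(np.array([[0.0, 0.9, 0.2],
                            [0.5, 0.0, 0.0]]))
matches = list(iter_matches(demo))
```

With the matrices from the example, list(iter_matches(c)) would give the same kind of (row, column, score) triples for the real result.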

Code that compares our method with calling the SciPy and NumPy functions directly is available in example/comparison.py.

Dependency and Install

Install numpy and cython before installing this package. Then run:

pip install sparse_dot_topn

Uninstall

pip uninstall sparse_dot_topn
