This package speeds up sparse matrix multiplication followed by selection of the top-n multiplication results.

Project description

sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

Comparing very large feature vectors and picking the best matches often comes down, in practice, to a sparse matrix multiplication followed by selecting the top-n multiplication results. This package implements a customized Cython function for that purpose. Compared with doing the same with SciPy and NumPy functions, our Cython approach improves speed by about 40% and reduces memory consumption.
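For reference, a SciPy and NumPy baseline of the kind mentioned above can be sketched roughly as follows. This is an illustrative assumption about what such a baseline looks like, not the code used in this package or in its benchmark; `naive_dot_topn` and its parameter names are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random


def naive_dot_topn(a, b, ntop, lower_bound=0.0):
    """Multiply two sparse matrices, then keep the ntop largest values per row.

    Hypothetical baseline: the full product is materialised before selection.
    """
    product = csr_matrix(a.dot(b))
    rows, cols, vals = [], [], []
    for i in range(product.shape[0]):
        # Slice out the stored entries of row i.
        start, end = product.indptr[i], product.indptr[i + 1]
        row_vals = product.data[start:end]
        row_cols = product.indices[start:end]
        # Drop entries that do not exceed the lower bound.
        keep = row_vals > lower_bound
        row_vals, row_cols = row_vals[keep], row_cols[keep]
        # Partial selection of the ntop largest entries.
        if row_vals.size > ntop:
            top = np.argpartition(row_vals, -ntop)[-ntop:]
            row_vals, row_cols = row_vals[top], row_cols[top]
        # Order the survivors from largest to smallest.
        order = np.argsort(-row_vals)
        rows.extend([i] * row_vals.size)
        cols.extend(row_cols[order])
        vals.extend(row_vals[order])
    return csr_matrix((vals, (rows, cols)), shape=product.shape)


# Small random inputs so the sketch runs quickly.
a = sparse_random(20, 500, density=0.05, format='csr', random_state=0)
b = sparse_random(500, 30, density=0.05, format='csr', random_state=1)
c = naive_dot_topn(a, b, ntop=5, lower_bound=0.01)
```

The sketch builds the whole product before discarding most of it, which is the kind of time and memory overhead the comparison above refers to.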

This package was made by the ING Wholesale Banking Advanced Analytics team. This blog explains how we implemented it.

Example

    from scipy.sparse import rand
    from sparse_dot_topn import awesome_cossim_topn

    # Two random sparse matrices in CSR format with compatible shapes.
    a = rand(100, 1000000, density=0.005, format='csr')
    b = rand(1000000, 200, density=0.005, format='csr')

    # Multiply a by b, keeping at most the 5 largest results per row
    # and using 0.01 as the lower bound on the values kept.
    c = awesome_cossim_topn(a, b, 5, 0.01)

Code that compares our method with calling SciPy and NumPy functions directly is available in example/comparison.py.
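A result such as `c` can be read row by row. The sketch below assumes the result is a SciPy CSR matrix; `top_matches` is a hypothetical helper, and `c_demo` is a small hand-made stand-in so that the snippet runs without the package installed.

```python
import numpy as np
from scipy.sparse import csr_matrix


def top_matches(result, row):
    """Return the (column, score) pairs stored for one row, best score first."""
    start, end = result.indptr[row], result.indptr[row + 1]
    cols = result.indices[start:end]
    scores = result.data[start:end]
    order = np.argsort(-scores)
    return [(int(cols[k]), float(scores[k])) for k in order]


# Hypothetical stand-in for a top-n result matrix.
c_demo = csr_matrix(np.array([
    [0.0, 0.9, 0.0, 0.4],
    [0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]))

print(top_matches(c_demo, 0))  # [(1, 0.9), (3, 0.4)]
```

Each pair gives the column index of a match and its score; a row with no stored entries yields an empty list.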

Dependency and Install

Install numpy and cython before installing this package. Then run:

pip install sparse_dot_topn

Uninstall

pip uninstall sparse_dot_topn

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

sparse_dot_topn-0.2.3-cp37-cp37m-macosx_10_12_x86_64.whl (31.2 kB view details)

Uploaded: CPython 3.7m, macOS 10.12+ x86-64

sparse_dot_topn-0.2.3-cp27-cp27m-macosx_10_12_intel.whl (61.5 kB view details)

Uploaded: CPython 2.7m, macOS 10.12+ Intel (x86-64, i386)

File details

Details for the file sparse_dot_topn-0.2.3-cp37-cp37m-macosx_10_12_x86_64.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.3-cp37-cp37m-macosx_10_12_x86_64.whl
  • Upload date:
  • Size: 31.2 kB
  • Tags: CPython 3.7m, macOS 10.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/2.7.10

File hashes

Hashes for sparse_dot_topn-0.2.3-cp37-cp37m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 634d2742fdeabb93820a3498cf4d49548e80f37c90d1095f7ba995ddf9689763
MD5 5e3359825910e4493738a66034008bb3
BLAKE2b-256 5da5032a42957ae31421d79a3b588c31a1722bb30eb7b97975a1cfd76a9e5342

File details

Details for the file sparse_dot_topn-0.2.3-cp27-cp27m-macosx_10_12_intel.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.3-cp27-cp27m-macosx_10_12_intel.whl
  • Upload date:
  • Size: 61.5 kB
  • Tags: CPython 2.7m, macOS 10.12+ Intel (x86-64, i386)
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/2.7.10

File hashes

Hashes for sparse_dot_topn-0.2.3-cp27-cp27m-macosx_10_12_intel.whl
Algorithm Hash digest
SHA256 7feacb8851dd913c741c1058a76dd8ae32458eb7c2cc2a5c674afc2a7b1ca74e
MD5 e7d5853a1125fe2bdb8de81810b88390
BLAKE2b-256 05f321de0c0bbe11cd39772a0250872ed356b63829b399f3165bf68b751ab2c8
