This package speeds up a sparse matrix multiplication followed by selection of the top-n multiplication results

Project description

sparse_dot_topn:

sparse_dot_topn provides a fast way to perform a sparse matrix multiplication followed by selection of the top-n multiplication results.

Comparing very large feature vectors and picking the best matches often comes down, in practice, to a sparse matrix multiplication followed by selecting the top-n multiplication results. This package implements a customized Cython function for that purpose. Compared with doing the same using SciPy and NumPy functions, our Cython approach improves speed by about 40% and reduces memory consumption.
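To make the operation concrete, here is a minimal sketch of the plain SciPy and NumPy route: multiply first, then keep the largest values in each row. The helper name `naive_dot_topn`, its parameters, and the choice to keep only values strictly above the lower bound are assumptions made for illustration; this is not the package's Cython implementation.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random


def naive_dot_topn(a, b, ntop, lower_bound=0.0):
    """Multiply two sparse matrices, then keep the ntop largest values per row.

    Hypothetical baseline helper, not part of sparse_dot_topn.
    """
    # The full sparse product is materialized before any selection happens.
    product = csr_matrix(a.dot(b))
    data, indices, indptr = [], [], [0]
    for row in range(product.shape[0]):
        start, end = product.indptr[row], product.indptr[row + 1]
        values = product.data[start:end]
        cols = product.indices[start:end]
        # Assumption: results at or below the lower bound are dropped.
        keep = values > lower_bound
        values, cols = values[keep], cols[keep]
        # Positions of the ntop largest remaining values, largest first.
        order = np.argsort(-values)[:ntop]
        data.extend(values[order])
        indices.extend(cols[order])
        indptr.append(len(data))
    return csr_matrix(
        (np.array(data, dtype=product.dtype),
         np.array(indices, dtype=np.int64),
         np.array(indptr, dtype=np.int64)),
        shape=product.shape,
    )


# Small random inputs so the sketch runs quickly.
a = sparse_random(20, 500, density=0.05, format='csr', random_state=0)
b = sparse_random(500, 30, density=0.05, format='csr', random_state=1)
top = naive_dot_topn(a, b, ntop=3, lower_bound=0.01)
```

This route holds the whole product in memory even though only a few values per row survive the selection.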

This package is made by the ING Wholesale Banking Advanced Analytics team. This blog explains how we implemented it.

Example

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse import rand
    from sparse_dot_topn import awesome_cossim_topn

    N = 10  # number of top results to keep per row of a
    a = rand(100, 1000000, density=0.005, format='csr')
    b = rand(1000000, 200, density=0.005, format='csr')

    # Use the standard implementation; 0.01 is the lower bound on kept results
    c = awesome_cossim_topn(a, b, N, 0.01)

    # Use the parallel implementation with 4 threads
    d = awesome_cossim_topn(a, b, N, 0.01, use_threads=True, n_jobs=4)

You can also find code that compares our method with calling SciPy and NumPy functions directly in example/comparison.py.
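The results c and d in the example are sparse matrices. Assuming they are SciPy CSR matrices with one row per row of a and one column per column of b, the matches for a row can be read from the CSR arrays. The sketch below uses a small hand-built matrix with made-up values as a stand-in, and `row_matches` is a hypothetical helper, not part of the package.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Stand-in for a top-n result: 3 rows, 5 candidate columns.
# The values are made up for illustration only.
matches = csr_matrix(np.array([
    [0.0, 0.9, 0.0, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.7, 0.0, 0.2, 0.0, 0.0],
]))


def row_matches(result, row):
    """Return the (column, score) pairs stored for one row, best score first."""
    start, end = result.indptr[row], result.indptr[row + 1]
    pairs = zip(result.indices[start:end].tolist(),
                result.data[start:end].tolist())
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Map every row index to its list of matches; rows without matches get [].
all_matches = {row: row_matches(matches, row) for row in range(matches.shape[0])}
```

Sorting inside the helper avoids relying on any particular ordering of the stored entries within a row.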

Dependency and Install

Install numpy and cython before installing this package. Then run:

pip install sparse_dot_topn

Uninstall

pip uninstall sparse_dot_topn

Download files

Source Distribution

sparse_dot_topn-0.2.6.tar.gz (105.5 kB view details)

Uploaded Source

Built Distributions

sparse_dot_topn-0.2.6-cp37-cp37m-macosx_10_14_x86_64.whl (60.7 kB view details)

Uploaded CPython 3.7m, macOS 10.14+ x86-64

sparse_dot_topn-0.2.6-cp27-cp27m-macosx_10_14_intel.whl (68.8 kB view details)

Uploaded CPython 2.7m, macOS 10.14+ Intel (x86-64, i386)

File details

Details for the file sparse_dot_topn-0.2.6.tar.gz.

File metadata

  • Download URL: sparse_dot_topn-0.2.6.tar.gz
  • Upload date:
  • Size: 105.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for sparse_dot_topn-0.2.6.tar.gz
Algorithm    Hash digest
SHA256       8df3e1568c3a445e763f90d59361499228c2f6a195dea31cdbae32b27150a504
MD5          ce3da810b960d738bc657aba669bece1
BLAKE2b-256  69496deae8f0500355a0a42d4181747721a10881aee4a757305024b89a9d760f

File details

Details for the file sparse_dot_topn-0.2.6-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.6-cp37-cp37m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 60.7 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for sparse_dot_topn-0.2.6-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm    Hash digest
SHA256       93fc81e001848bc76e8a7bad6ba0b954bcb082615896029b68f519d04b6371a7
MD5          455dea8897551e78356142c5c0f281fc
BLAKE2b-256  f11c364379e54503440dfd854949b719153c7cd49ce913004833cdcaad610f62

File details

Details for the file sparse_dot_topn-0.2.6-cp27-cp27m-macosx_10_14_intel.whl.

File metadata

  • Download URL: sparse_dot_topn-0.2.6-cp27-cp27m-macosx_10_14_intel.whl
  • Upload date:
  • Size: 68.8 kB
  • Tags: CPython 2.7m, macOS 10.14+ Intel (x86-64, i386)
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for sparse_dot_topn-0.2.6-cp27-cp27m-macosx_10_14_intel.whl
Algorithm    Hash digest
SHA256       2ef2e9d609b137b71281ad4b8e9195925ed0cbef46fce28f703f41c6d8b9fcb2
MD5          468950fcafef610cc988ca6f774f8f92
BLAKE2b-256  0230ef33c40391ae23817e4a410461231f1366f423fd69501b899191700d50e6
