Sparse Tools for Analysis

spartans - SPARse Tools for ANalysiS

When working with sparse matrices, it is desirable to handle them as if they were regular numpy arrays. Yet many popular array methods don’t exist for sparse matrices. spartans aims to help by providing many operations for working with them.

Full example notebook

Features

Mathematical Operations

A rich set of operations that sparse matrices do not support natively, such as variance, cov (covariance matrix), and corrcoef (correlation matrix).
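As an illustration of what such an operation involves, here is a minimal sketch of a column-wise variance computed directly on a scipy sparse matrix, using the identity Var(x) = E[x²] − E[x]². The helper name sparse_column_variance is hypothetical and is not part of the spartans API; it only shows that the result can be obtained without converting the input to a dense array.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sparse_column_variance(x, ddof=0):
    """Column-wise variance of a scipy sparse matrix (hypothetical helper).

    Zeros are treated as real values, matching numpy.var on the dense array.
    """
    n_rows = x.shape[0]
    # Column means: sum of stored values divided by the total row count.
    mean = np.asarray(x.mean(axis=0)).ravel()
    # Mean of squares: only stored entries are squared, zeros stay zero.
    mean_sq = np.asarray(x.multiply(x).mean(axis=0)).ravel()
    var = mean_sq - mean ** 2
    # Degrees-of-freedom correction, as in numpy.var(ddof=...).
    return var * n_rows / (n_rows - ddof)


dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 4.0],
                  [3.0, 0.0, 0.0]])
variances = sparse_column_variance(csr_matrix(dense))
```

The result should agree with dense.var(axis=0) up to floating-point error. The E[x²] − E[x]² form can lose precision when the mean is large relative to the spread, so treat this as a sketch rather than a numerically robust routine.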

Easy Indexing

Convenient methods to index “extra” sparse features by variance or by quantity.

Masking

Many algorithms treat the zeros in a sparse matrix as missing data, while others treat missing data as zeros, depending on the use case. spartans
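The difference between the two interpretations can be seen with a column mean. The sketch below uses only numpy and scipy, not the spartans masking API (which is not shown here), and the variable names are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

x = csr_matrix(np.array([[2.0, 0.0],
                         [4.0, 0.0],
                         [0.0, 6.0]]))

col_sums = np.asarray(x.sum(axis=0)).ravel()

# Interpretation 1: zeros are real values, so divide by the number of rows.
mean_zeros_as_values = col_sums / x.shape[0]

# Interpretation 2: zeros are missing, so divide by the number of stored entries.
stored_counts = x.getnnz(axis=0)
mean_zeros_as_missing = col_sums / stored_counts
```

Under the second interpretation, a column with no stored entries would divide by zero, so real code would need to decide how to handle empty columns.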

FeatureMatrix

FeatureMatrix is a first-class citizen in spartans. It is a wrapper around scipy.sparse.csr_matrix, built with data analysis and data science in mind.

Examples

Full example notebook

>>> import spartans as st
>>> from scipy.sparse import csr_matrix
>>> import numpy as np
>>> m = np.array([[1, -2, 0, 50],
...               [0, 0, 0, 100],
...               [1, 0, 0, 80],
...               [1, 4, 0, 0],
...               [0, 0, 0, 0],
...               [0, 4, 0, 0],
...               [0, 0, 0, -50]])
>>> c = csr_matrix(m)

We can get the correlation matrix of m using numpy.

>>> np.corrcoef(m, rowvar=False)
Out[]: array([[ 1.  , -0.08,   nan,  0.31],
              [-0.08,  1.  ,   nan, -0.35],
              [  nan,   nan,   nan,   nan],
              [ 0.31, -0.35,   nan,  1.  ]])

This won’t work with the sparse matrix c:

>>> np.corrcoef(c, rowvar=False)
AttributeError: 'float' object has no attribute 'shape'

With spartans, this works:

>>> st.corr(c)
Out[]: array([[ 1.  , -0.08,   nan,  0.31],
              [-0.08,  1.  ,   nan, -0.35],
              [  nan,   nan,   nan,   nan],
              [ 0.31, -0.35,   nan,  1.  ]])
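For intuition, a correlation matrix like this one can be computed from a sparse matrix without densifying the input. The following is a minimal sketch using only numpy and scipy; the function name sparse_corr is hypothetical, and this is not necessarily how st.corr is implemented.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sparse_corr(x):
    """Pearson correlation between the columns of a sparse matrix (sketch)."""
    x = csr_matrix(x, dtype=np.float64)
    n = x.shape[0]
    mean = np.asarray(x.mean(axis=0)).ravel()
    # The Gram matrix X^T X is (features x features), so converting it to a
    # dense array is cheap when the number of columns is small.
    gram = (x.T @ x).toarray()
    # Sample covariance: (X^T X - n * mean mean^T) / (n - 1).
    cov = (gram - n * np.outer(mean, mean)) / (n - 1)
    std = np.sqrt(np.diag(cov))
    # Zero-variance columns produce 0/0 here, which yields nan as in numpy.
    with np.errstate(divide="ignore", invalid="ignore"):
        return cov / np.outer(std, std)


m = np.array([[1, -2, 0, 50],
              [0, 0, 0, 100],
              [1, 0, 0, 80],
              [1, 4, 0, 0],
              [0, 0, 0, 0],
              [0, 4, 0, 0],
              [0, 0, 0, -50]])
corr = sparse_corr(csr_matrix(m))
```

Only the small feature-by-feature result is dense; the data matrix itself stays sparse throughout.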

The row and column of nan values appear because the original matrix has a column (feature) that is zero in every row. spartans can handle this with st.non_zero_index(c, axis=0, as_bool=False), which returns array([0, 1, 3]). The notebook covers much more functionality.
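The same column indices can be recovered with scipy alone, which may help clarify what is being selected. This sketch does not use spartans and makes no claim about how st.non_zero_index works internally.

```python
import numpy as np
from scipy.sparse import csr_matrix

m = np.array([[1, -2, 0, 50],
              [0, 0, 0, 100],
              [1, 0, 0, 80],
              [1, 4, 0, 0],
              [0, 0, 0, 0],
              [0, 4, 0, 0],
              [0, 0, 0, -50]])
c = csr_matrix(m)

# Number of stored (non-zero) entries in each column.
nnz_per_column = c.getnnz(axis=0)
# Indices of the columns that contain at least one non-zero value.
keep = np.flatnonzero(nnz_per_column)

# Correlation restricted to the non-empty columns contains no nan here.
corr = np.corrcoef(c[:, keep].toarray(), rowvar=False)
```

Note that a column can be non-zero yet constant, in which case its variance is still zero and the correlation would still contain nan; filtering by variance would be needed for that case.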

Credits

History

0.1.0 (2020-02-20)

  • First release on PyPI.
