Sparse Tools for Analysis
spartans - SPARse Tools for ANalysiS
When working with sparse matrices, it is desirable to work with them as if they were regular numpy arrays. Yet many popular array methods don’t exist for sparse matrices. spartans aims to help by providing many operations for working with them.
Full example notebook
Free software: GNU General Public License v3
Documentation: https://spartans.readthedocs.io.
Features
- Mathematical Operations
A rich set of operations that sparse matrices do not support out of the box, such as variance, cov (covariance matrix) and corrcoef (correlation matrix).
- Easy Indexing
Convenient methods for indexing “extra” sparse features by variance or by quantity.
- Masking
Depending on the use case, many algorithms treat the zeros in a sparse matrix as missing data, or treat missing data as zeros. spartans
- FeatureMatrix
FeatureMatrix is a first-class citizen in spartans. It is a wrapper around scipy.sparse.csr_matrix, built with data analysis and data science in mind.
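To make the first and third features more concrete, here is a minimal sketch that uses only numpy and scipy. It is not spartans' implementation or API. The helper names `column_variance` and `column_mean_ignoring_zeros` are hypothetical and exist only for illustration. The first helper counts zeros as real values. The second treats them as missing data.

```python
import numpy as np
from scipy.sparse import csr_matrix


def column_variance(x):
    """Population variance per column; zeros count as real values."""
    n_rows = x.shape[0]
    mean = np.asarray(x.sum(axis=0)).ravel() / n_rows
    # E[x^2] is computed on the stored entries only, so x is never densified.
    mean_of_squares = np.asarray(x.multiply(x).sum(axis=0)).ravel() / n_rows
    return mean_of_squares - mean ** 2


def column_mean_ignoring_zeros(x):
    """Mean of the stored entries per column; zeros are treated as missing."""
    totals = np.asarray(x.sum(axis=0)).ravel()
    counts = x.getnnz(axis=0)  # number of stored entries in each column
    with np.errstate(invalid="ignore", divide="ignore"):
        return totals / counts  # nan for a column with no stored entries


x = csr_matrix(np.array([[1.0, 0.0, 2.0],
                         [3.0, 0.0, 0.0],
                         [5.0, 0.0, 4.0]]))
variances = column_variance(x)
masked_means = column_mean_ignoring_zeros(x)
```

The two helpers differ only in the denominator: the number of rows when zeros are values, the number of stored entries when zeros are missing. That choice is what the masking feature is about.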
Examples
>>> import spartans as st
>>> from scipy.sparse import csr_matrix
>>> import numpy as np
>>> m = np.array([[1, -2, 0, 50],
...               [0, 0, 0, 100],
...               [1, 0, 0, 80],
...               [1, 4, 0, 0],
...               [0, 0, 0, 0],
...               [0, 4, 0, 0],
...               [0, 0, 0, -50]])
>>> c = csr_matrix(m)
We can get the correlation matrix of m using numpy.
>>> np.corrcoef(m, rowvar=False)
Out[]: array([[ 1. , -0.08, nan, 0.31],
[-0.08, 1. , nan, -0.35],
[ nan, nan, nan, nan],
[ 0.31, -0.35, nan, 1. ]])
This won’t work with the sparse matrix c:
>>> np.corrcoef(c, rowvar=False)
AttributeError: 'float' object has no attribute 'shape'
But with spartans this can be done.
>>> st.corr(c)
Out[]: array([[ 1. , -0.08, nan, 0.31],
[-0.08, 1. , nan, -0.35],
[ nan, nan, nan, nan],
[ 0.31, -0.35, nan, 1. ]])
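For readers curious how a correlation matrix can be computed without densifying the input, the following is a minimal sketch in plain numpy and scipy. It is an assumption about one possible approach, not a description of how st.corr works internally, and `sparse_corr` is a hypothetical name. It relies on the identity cov(i, j) = E[x_i x_j] - E[x_i] E[x_j], where the second moment comes from the sparse product of the matrix transpose with itself.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sparse_corr(x):
    """Pearson correlation between the columns of a sparse matrix."""
    x = csr_matrix(x, dtype=np.float64)
    n_rows = x.shape[0]
    mean = np.asarray(x.mean(axis=0)).ravel()
    # Only the small (n_features x n_features) product is densified.
    second_moment = (x.T @ x).toarray() / n_rows
    cov = second_moment - np.outer(mean, mean)
    with np.errstate(invalid="ignore", divide="ignore"):
        std = np.sqrt(np.diag(cov))
        return cov / np.outer(std, std)  # nan wherever a column is constant


m = np.array([[1, -2, 0, 50],
              [0, 0, 0, 100],
              [1, 0, 0, 80],
              [1, 4, 0, 0],
              [0, 0, 0, 0],
              [0, 4, 0, 0],
              [0, 0, 0, -50]])
corr = sparse_corr(csr_matrix(m))
```

The normalisation constant (dividing by n or by n - 1) cancels in the ratio, so the result agrees with np.corrcoef on the dense matrix for every column that is not constant.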
The row and column of nan values appear because the original matrix has a column (feature) that is zero in every row. spartans can handle that using st.non_zero_index(c, axis=0, as_bool=False), which returns array([0, 1, 3]), the indices of the columns that contain at least one non-zero value. Much more functionality is shown in the notebook.
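The same column indices can be reproduced with scipy alone, which may help clarify what such an index represents. This is a sketch under the assumption that "non-zero" means "has at least one non-zero stored entry". It does not use spartans or reflect its internals.

```python
import numpy as np
from scipy.sparse import csr_matrix

m = np.array([[1, -2, 0, 50],
              [0, 0, 0, 100],
              [1, 0, 0, 80],
              [1, 4, 0, 0],
              [0, 0, 0, 0],
              [0, 4, 0, 0],
              [0, 0, 0, -50]])
c = csr_matrix(m)

# Drop explicitly stored zeros so that getnnz counts true non-zeros only.
c.eliminate_zeros()

# Indices of columns with at least one non-zero entry.
non_zero_columns = np.flatnonzero(c.getnnz(axis=0))  # array([0, 1, 3])

# Keep only those columns; the all-zero feature is removed.
filtered = c[:, non_zero_columns]
```

Filtering before computing a correlation matrix removes the all-zero feature, so the result contains no nan row or column.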
Credits
This open-source project is backed by SentinelOne
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.0 (2020-02-20)
First release on PyPI.