Persistence Diagram Vectorizer

PerVect

Vectorization of persistence diagrams and approximate Wasserstein distance. This is done by approximating persistence diagrams with Gaussian mixture models and then measuring the Wasserstein distance between the Gaussian mixtures. As the number of components in the mixture model increases, the accuracy of the approximation increases accordingly, with equivalence in the limit.
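For reference, when two diagrams are described by weight vectors over the same set of mixture components, the Wasserstein distance between them can be written as a discrete optimal transport problem. The following is a sketch of the standard formulation, not a statement of the library's exact implementation. The symbols a, b, M, T and k are notation introduced here, and the formula assumes both weight vectors carry the same total mass:

```latex
W(a, b) = \min_{T \in \mathbb{R}_{\ge 0}^{k \times k}}
          \sum_{i=1}^{k} \sum_{j=1}^{k} T_{ij} M_{ij}
\quad \text{subject to} \quad
\sum_{j=1}^{k} T_{ij} = a_i, \qquad
\sum_{i=1}^{k} T_{ij} = b_j
```

Here a and b are the component weights of the two diagrams, M_ij is the ground distance between components i and j, and T_ij is the amount of mass moved from component i to component j.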

The library is implemented as a scikit-learn transformer: it takes a list of persistence diagrams (preferably in birth-lifetime format) as input and transforms it into a vector representation (specifically, the component weights of a Gaussian mixture model fit to the union of all the diagrams). Distances can then be computed as the Wasserstein distance over a ground-distance matrix provided as an attribute of the transformer. Alternatively, UMAP can be used to convert to a lower-dimensional Euclidean distance representation.
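Many tools produce diagrams as (birth, death) pairs rather than the birth-lifetime format mentioned above. The following is a minimal, dependency-free sketch of the conversion. The helper name `to_birth_lifetime` is hypothetical and not part of pervect, and dropping points with infinite death is an assumption made for this example:

```python
import math


def to_birth_lifetime(diagram):
    """Convert (birth, death) pairs into (birth, lifetime) pairs.

    Assumption for this sketch: points with infinite death are dropped,
    because their lifetime is unbounded.
    """
    converted = []
    for birth, death in diagram:
        if math.isinf(death):
            continue  # skip essential classes (infinite lifetime)
        converted.append((birth, death - birth))
    return converted


# A toy diagram with one point that never dies.
diagram = [(0.0, 1.5), (0.25, 0.75), (0.5, math.inf)]
birth_lifetime = to_birth_lifetime(diagram)
print(birth_lifetime)
```

In practice diagrams are usually stored as NumPy arrays of shape (n_points, 2), where the same conversion is a subtraction of the first column from the second.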

How to use PerVect

The pervect library inherits from sklearn classes and can be used as an sklearn transformer:

import pervect
vects = pervect.PersistenceVectorizer().fit_transform(diagrams)

It can also be used in standard sklearn pipelines along with other machine learning tools including clustering and classifiers.

Installation

Requirements:

  • Python >= 3.6

  • scikit-learn

  • umap-learn

  • numba

  • joblib

  • pot

You can install pervect from PyPI with pip:

pip install pervect

For a manual install, get this package:

wget https://github.com/scikit-tda/pervect/archive/master.zip
unzip master.zip
rm master.zip
cd pervect-master

Install the requirements:

pip install -r requirements.txt

Install the package:

pip install .
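After either install route, a quick way to confirm that the package is importable from the current environment is to look it up with the standard library. This is a minimal sketch. The helper name `is_installed` is hypothetical and not part of pervect:

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


# 'pervect' is the import name used in the usage example above.
print("pervect importable:", is_installed("pervect"))
```

If this prints False, the install most likely went into a different Python environment than the one running the check.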

License

The pervect package is 3-clause BSD licensed.

We would like to note that the pervect package makes heavy use of NumFOCUS-sponsored projects and would not be possible without NumFOCUS's support of those projects, so please consider contributing to NumFOCUS.

Contributing

Contributions are more than welcome! There are lots of opportunities for potential projects, so please get in touch if you would like to help out. Everything from code to notebooks to examples and documentation is equally valuable, so please don't feel you can't contribute. To contribute, please fork the project, make your changes, and submit a pull request. We will do our best to work through any issues with you and get your code merged into the main branch.
