
Utilities related to Jaccard/Tanimoto coefficients.


jaccard.py


🗺️ Overview

jaccard.py is a pure-Python package providing Jaccard index computation.

This library depends only on NumPy and supports all modern Python versions (3.6+).

📋 Features

Array-agnostic interface based on duck typing: all functions should work with NumPy arrays, MLX arrays, or PyTorch tensors unless noted otherwise.
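As a rough illustration of what a duck-typed interface allows, the sketch below computes a Jaccard similarity between two boolean vectors using only operators and methods (`&`, `|`, `.sum()`) that NumPy boolean arrays provide, and that PyTorch and MLX boolean arrays are assumed to provide as well. The name `jaccard_similarity_sketch` is hypothetical and is not part of this package's API; the handling of two empty vectors is a convention chosen for the example.

```python
import numpy as np


def jaccard_similarity_sketch(x, y):
    """Jaccard similarity |x AND y| / |x OR y| of two boolean vectors.

    Hypothetical helper, not this package's API. It relies only on `&`,
    `|` and `.sum()`, so it never calls a NumPy function directly.
    """
    intersection = (x & y).sum()
    union = (x | y).sum()
    # Convention chosen for this sketch: two all-False vectors give 0.0.
    if union == 0:
        return 0.0
    return float(intersection) / float(union)


x = np.array([True, True, False, False, True])
y = np.array([True, False, False, True, True])
print(jaccard_similarity_sketch(x, y))  # 2 shared positions / 4 in the union = 0.5
```

Because the function body never names NumPy, the same code should accept any array type exposing these three operations, which is the idea behind an array-agnostic interface.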

The following functions are implemented:

  • Jaccard similarity[1]: measure similarity between boolean vectors, comparable to scipy.spatial.distance.jaccard (which returns the corresponding dissimilarity).
  • probabilistic Jaccard similarity[2]: measure similarity between probability vectors while quantifying uncertainty.
  • centered Jaccard similarity and Jaccard testing[3]: identify non-random co-occurrences between samples with robust statistical testing.
  • collision probability Jaccard index[4]: measure similarity between positive indices, using a metric that is scale invariant, sensitive to changes in support, and computable as a collision probability.
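To make the scale-invariance property of the last item concrete, here is a minimal pure-Python sketch of the probability Jaccard index as defined in reference [4]: the sum, over positions where both vectors are positive, of 1 / Σ_j max(x_j / x_i, y_j / y_i). The helper name `probability_jaccard_sketch` is hypothetical, and the package's own implementation and function names may differ.

```python
def probability_jaccard_sketch(x, y):
    """Probability Jaccard index of two non-negative vectors.

    J_P(x, y) = sum over i with x_i > 0 and y_i > 0 of
                1 / sum_j max(x_j / x_i, y_j / y_i)

    Hypothetical helper, not this package's API.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0.0
    for xi, yi in zip(x, y):
        if xi > 0 and yi > 0:  # only the shared support contributes
            denom = sum(max(xj / xi, yj / yi) for xj, yj in zip(x, y))
            total += 1.0 / denom
    return total


p = [0.5, 0.5, 0.0]
q = [0.5, 0.0, 0.5]
scaled = [10 * v for v in p]  # same direction as p, different scale

print(probability_jaccard_sketch(p, q))       # about 1/3
print(probability_jaccard_sketch(p, p))       # identical vectors give 1.0
print(probability_jaccard_sketch(scaled, q))  # unchanged by rescaling p
```

Only ratios of entries within each vector enter the formula, so multiplying either input by a positive constant leaves the result unchanged, while moving mass onto or off the shared support changes it.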

🔧 Installing

Install the jaccard package directly from PyPI, which hosts universal wheels that can be installed with pip:

$ pip install jaccard

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⚖️ License

This library is provided under the MIT License.

This project was developed by Martin Larralde during his PhD project at the Leiden University Medical Center in the Zeller team.

📚 References

  • [1] Jaccard, P. "Étude comparative de la distribution florale dans une portion des Alpes et du Jura." Bulletin de la Société Vaudoise des Sciences Naturelles 37, 547–579 (1901). doi:10.1111/j.1469-8137.1912.tb05611.x
  • [2] Martire, I., Da Silva, P. N., Plastino, A., Fabris, F. & Freitas, A. A. "A novel probabilistic Jaccard distance measure for classification of sparse and uncertain data". Proceedings of the 5th Symposium on Knowledge Discovery, Mining and Learning, 81–88 (2017).
  • [3] Chung, N. C., Miasojedow, B., Startek, M. & Gambin, A. "Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data". BMC Bioinformatics 20, 644 (2019). doi:10.1186/s12859-019-3118-5
  • [4] Moulton, R. & Jiang, Y. "Maximally Consistent Sampling and the Jaccard Index of Probability Distributions". In 2018 IEEE International Conference on Data Mining (ICDM), 347–356 (2018). doi:10.1109/ICDM.2018.00050
