Utilities related to Jaccard/Tanimoto coefficients.
jaccard.py 
🗺️ Overview
jaccard.py is a pure-Python package providing Jaccard index computation.
This library only depends on NumPy and is available for all modern Python versions (3.6+).
📋 Features
Array-agnostic interface using duck typing: all functions should work with NumPy arrays, MLX arrays, or PyTorch tensors, unless noted otherwise.
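To illustrate what a duck-typed interface can look like, here is a minimal sketch of the plain Jaccard similarity between two boolean vectors. It relies only on the `&` and `|` operators and the `.sum()` method, which NumPy arrays and PyTorch tensors both provide. The function name `jaccard_similarity` and the choice to return 0.0 when both vectors are all-False are assumptions made for this example. This is not the package's actual API, and the sketch has only been checked against NumPy.

```python
import numpy as np


def jaccard_similarity(x, y):
    """Jaccard similarity |x AND y| / |x OR y| of two boolean vectors.

    Hypothetical helper for illustration; not part of the jaccard package.
    Only operators shared by common array types are used (duck typing).
    """
    intersection = (x & y).sum()  # positions set in both vectors
    union = (x | y).sum()         # positions set in at least one vector
    # Assumption: two all-False vectors are given a similarity of 0.0.
    return float(intersection) / float(union) if union else 0.0


# Example inputs: one shared position out of three occupied positions.
a = np.array([True, True, False, False])
b = np.array([True, False, True, False])
similarity = jaccard_similarity(a, b)
```

Because the function never calls a NumPy-specific routine on its inputs, the same body should in principle accept other array types exposing the same operators.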
The following functions are implemented:
- Jaccard similarity[1]: measure similarity between boolean vectors, similar to scipy.spatial.distance.jaccard.
- probabilistic Jaccard similarity[2]: measure similarity between probability vectors while quantifying uncertainty.
- centered Jaccard similarity and Jaccard testing[3]: identify non-random co-occurrences between samples with robust statistical testing.
- collision probability Jaccard index[4]: measure similarity between positive indices, using a metric that is scale invariant, sensitive to changes in support, and computable as a collision probability.
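As an illustration of the scale invariance mentioned in the last item, the sketch below computes the probability Jaccard index as defined by Moulton & Jiang[4]: the sum, over every index i where both vectors are positive, of 1 / Σ_j max(x_j / x_i, y_j / y_i). The function name `probability_jaccard` is hypothetical and the code is a plain NumPy reading of that formula, not the implementation shipped in this package.

```python
import numpy as np


def probability_jaccard(x, y):
    """Probability Jaccard index of two non-negative vectors.

    Hypothetical helper for illustration; not part of the jaccard package.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Only indices where both vectors are strictly positive contribute.
    support = (x > 0) & (y > 0)
    total = 0.0
    for i in np.flatnonzero(support):
        # Rescale each vector by its i-th entry and take the elementwise max;
        # the ratios are unchanged when x or y is multiplied by a constant.
        total += 1.0 / np.maximum(x / x[i], y / y[i]).sum()
    return total


# Example inputs (made up for this sketch).
p = probability_jaccard([1, 1], [1, 3])
p_scaled = probability_jaccard([10, 10], [1, 3])
```

Each term depends only on ratios within a vector, so rescaling either input leaves the result unchanged, and vectors with disjoint support score 0.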
🔧 Installing
Install the jaccard package directly from PyPI, which hosts universal wheels that can be installed with pip:
$ pip install jaccard
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
🏗️ Contributing
Contributions are more than welcome! See CONTRIBUTING.md for more details.
⚖️ License
This library is provided under the MIT License.
This project was developed by Martin Larralde during his PhD project at the Leiden University Medical Center in the Zeller team.
📚 References
- [1] Jaccard, P. "Étude comparative de la distribution florale dans une portion des Alpes et du Jura." Bulletin de la Société Vaudoise des Sciences Naturelles 37, 547–579 (1901). doi:10.1111/j.1469-8137.1912.tb05611.x
- [2] Martire, I., Da Silva, P. N., Plastino, A., Fabris, F. & Freitas, A. A. "A novel probabilistic Jaccard distance measure for classification of sparse and uncertain data". Proceedings of the 5th Symposium on Knowledge Discovery, Mining and Learning, 81-88 (2017).
- [3] Chung, N. C., Miasojedow, B., Startek, M. & Gambin, A. "Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data". BMC Bioinformatics 20, 644 (2019). doi:10.1186/s12859-019-3118-5
- [4] Moulton, R. & Jiang, Y. "Maximally Consistent Sampling and the Jaccard Index of Probability Distributions". 2018 IEEE International Conference on Data Mining (ICDM), 347–356 (2018). doi:10.1109/ICDM.2018.00050