
Python implementation of binary similarity and distance measures.


License: MIT. Code style: black.

binsdpy - binary similarity and distance measures

Python implementation of binary similarity measures (see [1]) and binary distance measures (see [2]). Feature vectors can be given either as bitsets (an immutable ordered set data type) or as numpy.ndarray objects.

Example

Example based on bitsets:

from bitsets import bitset
from binsdpy.similarity import jaccard
from binsdpy.distance import euclid

Colors = bitset("Colors", ("red", "blue", "green", "yellow"))

a = Colors.frommembers(["red", "blue"])
b = Colors.frommembers(["red", "yellow"])

jaccard(a, b)
# > 0.3333333333333333
euclid(a, b)
# > 1.4142135623730951
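
The two values above can be reproduced by hand from the counts of shared and unshared members. The following is a minimal sketch using plain Python integers as bitmasks. It is not binsdpy's implementation, and the names DOMAIN, to_mask, popcount, and counts are hypothetical helpers introduced only for this illustration. It assumes the usual definitions from [2]: Jaccard similarity is a / (a + b + c) and Euclidean distance is sqrt(b + c), where a is the number of members in both sets and b and c are the numbers of members in only one of them.

```python
import math

# Hypothetical helpers for illustration; not part of binsdpy or bitsets.
DOMAIN = ("red", "blue", "green", "yellow")

def to_mask(members):
    """Encode members as an int with one bit per item of DOMAIN."""
    mask = 0
    for item in members:
        mask |= 1 << DOMAIN.index(item)
    return mask

def popcount(x):
    """Number of set bits in a non-negative int."""
    return bin(x).count("1")

def counts(x, y):
    """Return (a, b, c): members in both, only in x, only in y."""
    return popcount(x & y), popcount(x & ~y), popcount(~x & y)

x = to_mask(["red", "blue"])
y = to_mask(["red", "yellow"])
a, b, c = counts(x, y)

jaccard_value = a / (a + b + c)   # assumes at least one member overall
euclid_value = math.sqrt(b + c)

print(jaccard_value)
# > 0.3333333333333333
print(euclid_value)
# > 1.4142135623730951
```

Bitwise operations are a natural fit here because each set is a fixed-width pattern of bits over the same domain, so the counts reduce to a few AND and NOT operations.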

Example based on np.ndarray:

import numpy as np
from binsdpy.similarity import jaccard
from binsdpy.distance import euclid

a = np.array([1, 1, 0, 0], dtype=bool)
b = np.array([1, 0, 0, 1], dtype=bool)

jaccard(a, b)
# > 0.3333333333333333
euclid(a, b)
# > 1.4142135623730951
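
The boolean arrays in this example describe the same two sets as the bitsets example: each position stands for one color in the order red, blue, green, yellow. The sketch below shows that correspondence and the 2x2 contingency counts that binary measures are typically defined on. It uses only NumPy. DOMAIN, encode, and contingency are hypothetical names for this illustration and are not part of binsdpy's API.

```python
import numpy as np

# Hypothetical helpers for illustration; not part of binsdpy.
DOMAIN = np.array(["red", "blue", "green", "yellow"])

def encode(members):
    """Boolean vector marking which items of DOMAIN are in members."""
    return np.isin(DOMAIN, members)

def contingency(x, y):
    """Return (a, b, c, d): both present, only x, only y, both absent."""
    a = int(np.sum(x & y))
    b = int(np.sum(x & ~y))
    c = int(np.sum(~x & y))
    d = int(np.sum(~x & ~y))
    return a, b, c, d

x = encode(["red", "blue"])     # same vector as np.array([1, 1, 0, 0], dtype=bool)
y = encode(["red", "yellow"])   # same vector as np.array([1, 0, 0, 1], dtype=bool)

print(x.tolist())
# > [True, True, False, False]
print(contingency(x, y))
# > (1, 1, 1, 1)
```

With a = b = c = 1, the Jaccard similarity a / (a + b + c) is 1/3 and the Euclidean distance sqrt(b + c) is sqrt(2), matching the output shown above.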

Installation

The package is available as an alpha release via pip.

$ pip install binsdpy

Dependencies

binsdpy requires:

  • Python (>= 3.6)
  • bitsets
  • numpy

Reference

[1] Brusco, M., Cradit, J. D., & Steinley, D. (2021). A comparison of 71 binary similarity coefficients: The effect of base rates. PLoS ONE, 16(4), e0247751. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0247751

[2] Choi, S. S., Cha, S. H., & Tappert, C. C. (2010). A survey of binary similarity and distance measures. Journal of Systemics, Cybernetics and Informatics, 8(1), 43-48. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.352.6123&rep=rep1&type=pdf


Source Distribution

binsdpy-0.2.4.tar.gz (12.5 kB)

File details

Details for the file binsdpy-0.2.4.tar.gz.

File metadata

  • Download URL: binsdpy-0.2.4.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.8.2 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.12

File hashes

Hashes for binsdpy-0.2.4.tar.gz

Algorithm    Hash digest
SHA256       38786053908c5fd3ee037e5ac64b701116a18015d28b1d4f4a2d83a4c8e4aea2
MD5          4c1a023b1857b5bf059b109334234780
BLAKE2b-256  acef57962a8f3e8118fd5a88c7efa5cfc9267b7a84045d34aa91ad8a4bd5ec25
