
Distances and divergences between distributions implemented in Python.

Project description


Distances and divergences between dictionaries implemented in Python 3.6.

In the complexity notations, n is len(a) and m is len(b).

The samples are dictionaries generated by the project's test utilities.

How do I get it?

Just type into your terminal:

pip install dictances

Basic example

For each metric, an example is present in the folder examples. Here’s a basic example for those too lazy to click links (like me).

import random
from dictances import cosine, euclidean, canberra
random.seed(42)  # for reproducibility

# Simple function to generate the example dictionaries


def generate_example_dict(n=1000):
    return {random.randint(0, 1000): random.uniform(0, 1000) for i in range(n)}


a, b = generate_example_dict(), generate_example_dict()

print(cosine(a, b))
# >>> 0.52336690346601

print(euclidean(a, b))
# >>> 15119.400349404095

print(canberra(a, b))
# >>> 624.9088876554047

Metrics table

===========================  ======================  ======================  ============
Metric name                  Usage example           Average time on sample  Complexity
===========================  ======================  ======================  ============
Euclidean distance           euclidean               90.4 µs ± 2.5 µs        O(n+m)
Squared variation            squared_variation       90.8 µs ± 1.43 µs       O(n+m)
Total variation              total_variation         92.3 µs ± 1.28 µs       O(n+m)
Nth variation                nth_variation           91.1 µs ± 1.2 µs        O(n+m)
Manhattan distance           manhattan               92.7 µs ± 1.43 µs       O(n+m)
Mean absolute error          mae                     92.3 µs ± 1.28 µs       O(n+m)
Mean squared error           mse                     91.1 µs ± 1.2 µs        O(n+m)
Chebyshev distance           chebyshev               101 µs ± 2.14 µs        O(n+m)
Minkowski distance           minkowsky               91.1 µs ± 2.05 µs       O(n+m)
Canberra distance            canberra                71.8 µs ± 1.95 µs       O(n+m)
Cosine distance              cosine                  61.3 µs ± 835 ns        O(n+m)
Pearson distance             pearson                 46.9 µs ± 1.23 µs       O(n+m)
Hamming distance             hamming                 28.7 µs ± 784 ns        O(min(n, m))
Normalized Total Variation   normal_total_variation  34.6 µs ± 543 ns        O(min(n, m))
Kullback Leibler divergence  kullback_leibler        24 µs ± 587 ns          O(min(n, m))
Jensen Shannon divergence    jensen_shannon          38.2 µs ± 1.18 µs       O(min(n, m))
Bhattacharyya distance       bhattacharyya           32.7 µs ± 655 ns        O(min(n, m))
Hellinger distance           hellinger               42 µs ± 467 ns          O(min(n, m))
===========================  ======================  ======================  ============
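
The two complexity classes in the table correspond to two ways of walking sparse dictionaries. What follows is a minimal sketch, not the library's own code: it assumes that a key missing from one dictionary counts as zero, and the names euclidean_sketch and bhattacharyya_sketch are hypothetical. Under that assumption, a metric where a value on one side still contributes when the other side is zero has to visit every key of both inputs, which gives O(n+m). A metric whose terms vanish when either value is zero only needs to walk the smaller dictionary, which gives O(min(n, m)).

```python
import math


def euclidean_sketch(a, b):
    """O(n + m): every key of both dictionaries contributes a term."""
    total = 0.0
    # Keys of a, paired with the value in b (assumed 0 when missing).
    for key, value in a.items():
        diff = value - b.get(key, 0.0)
        total += diff * diff
    # Keys that appear only in b.
    for key, value in b.items():
        if key not in a:
            total += value * value
    return math.sqrt(total)


def bhattacharyya_sketch(a, b):
    """O(min(n, m)): only keys shared by both dictionaries contribute.

    Assumes non-negative values, for example two probability distributions.
    """
    # Walk the smaller dictionary and look keys up in the larger one.
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    coefficient = sum(
        math.sqrt(value * large[key])
        for key, value in small.items()
        if key in large
    )
    # No overlap at all: the distance is unbounded.
    if coefficient == 0:
        return math.inf
    return -math.log(coefficient)


print(euclidean_sketch({"x": 3.0}, {"y": 4.0}))  # 5.0
```

The lookup in the larger dictionary is a hash lookup, so the cost of the second function is driven by the size of the smaller input only.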

Test computer specifications

The computer on which the metrics were timed had the following specifications:

Computer specifications

=====================  =============
Model Name             MacBook Pro
Processor Name         Intel Core i7
Processor Speed        2.3 GHz
Number of Processors   1
Total Number of Cores  4
L2 Cache (per Core)    256 KB
L3 Cache               6 MB
Memory                 16 GB
=====================  =============
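
Timings like those in the metrics table depend on the machine, so they may be worth re-measuring locally. Here is a minimal sketch using the standard-library timeit module. It assumes the table reports a mean and a standard deviation per call, and manhattan_sketch and time_metric are hypothetical stand-ins rather than part of the package. Any function imported from dictances could be passed in place of the stand-in.

```python
import random
import statistics
import timeit


def generate_example_dict(n=1000):
    # Same shape of input as the basic example: random keys, random values.
    return {random.randint(0, 1000): random.uniform(0, 1000) for _ in range(n)}


def manhattan_sketch(a, b):
    # Hypothetical stand-in metric; missing keys are assumed to count as zero.
    keys = a.keys() | b.keys()
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


def time_metric(metric, a, b, repeat=5, number=200):
    """Return (mean, standard deviation) of the time per call, in seconds."""
    # Each of the `repeat` runs calls the metric `number` times.
    runs = timeit.repeat(lambda: metric(a, b), repeat=repeat, number=number)
    per_call = [run / number for run in runs]
    # statistics.stdev needs at least two runs, so keep repeat >= 2.
    return statistics.mean(per_call), statistics.stdev(per_call)


random.seed(42)  # for reproducibility
a, b = generate_example_dict(), generate_example_dict()
mean, std = time_metric(manhattan_sketch, a, b)
print(f"{mean * 1e6:.1f} µs ± {std * 1e6:.2f} µs per call")
```

The printed numbers will differ from the table, since they depend on the hardware and on the stand-in function.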

