A Monte Carlo approximation to the adjusted and standardized mutual information for faster clustering comparisons

FastAMI

A Monte Carlo approximation to the adjusted and standardized mutual information for faster clustering comparisons. You can use this package as a drop-in replacement for sklearn.metrics.adjusted_mutual_info_score when the exact calculation is too slow, e.g. because of large datasets and large numbers of clusters.
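To give a rough idea of what a Monte Carlo adjustment for chance looks like, here is a minimal standard-library sketch. It is not the algorithm implemented in fastami: it naively estimates the expected mutual information by shuffling one labeling many times, assumes the arithmetic mean of the two entropies as normalizer, and the names entropy, mutual_info and naive_ami_mc are hypothetical helpers introduced only for this illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def mutual_info(a, b):
    """Mutual information (natural log) between two labelings."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        c / n * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def naive_ami_mc(a, b, n_samples=5000, seed=None):
    """Naive permutation-based estimate of the AMI (illustration only).

    Returns the AMI estimate and the standard error of the sampled
    expected mutual information (EMI).
    """
    rng = random.Random(seed)
    shuffled = list(b)
    samples = []
    for _ in range(n_samples):
        rng.shuffle(shuffled)  # random relabeling with the same cluster sizes
        samples.append(mutual_info(a, shuffled))

    emi = sum(samples) / n_samples
    variance = sum((s - emi) ** 2 for s in samples) / (n_samples - 1)
    emi_error = math.sqrt(variance / n_samples)

    mi = mutual_info(a, b)
    mean_entropy = 0.5 * (entropy(a) + entropy(b))
    ami = (mi - emi) / (mean_entropy - emi)
    return ami, emi_error


labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]

ami, emi_error = naive_ami_mc(labels_true, labels_pred, seed=0)

# The exact AMI for this small pair is -0.25, so the estimate should be close.
print(f"AMI ~ {ami:.2f} (EMI standard error {emi_error:.3f})")
```

Shuffling costs one full pass over the data per sample, so this sketch only conveys the idea; it does not reflect the performance of the package.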

Installation

fastami requires Python >=3.8. You can install fastami via pip from PyPI:

pip install fastami

Usage Examples

FastAMI

You can use FastAMI as you would use adjusted_mutual_info_score from scikit-learn. In addition to the AMI estimate, it returns an estimate of the error:

from fastami import adjusted_mutual_info_mc

labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]

ami, ami_error = adjusted_mutual_info_mc(labels_true, labels_pred)

# Output: AMI = -0.255 +- 0.008
print(f"AMI = {ami:.3f} +- {ami_error:.3f}")

Note that the output may vary slightly between runs because of the Monte Carlo approach. To get reproducible results, pass the seed argument. By default, the algorithm terminates when it reaches an accuracy of 0.01. You can adjust this behavior using the accuracy_goal argument.

FastSMI

FastSMI works similarly:

from fastami import standardized_mutual_info_mc

labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]

smi, smi_error = standardized_mutual_info_mc(labels_true, labels_pred)

# Output: SMI = -0.673 +- 0.035
print(f"SMI = {smi:.3f} +- {smi_error:.3f}")

While FastSMI is usually faster than an exact calculation of the SMI, it is still orders of magnitude slower than FastAMI. Since the SMI, unlike the AMI, is not confined to the interval [-1, 1], FastSMI by default terminates once either the absolute or the relative error reaches 0.1, whichever happens first. You can adjust this behavior using the precision_goal argument.
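The stopping rule described above can be sketched as a small check. This is an interpretation of the description, not code taken from fastami: precision_reached is a hypothetical helper, and it assumes the relative error is measured against the absolute value of the current estimate.

```python
def precision_reached(value, error, precision_goal=0.1):
    """Return True if the absolute OR the relative error meets the goal."""
    absolute_ok = error <= precision_goal
    relative_ok = error <= precision_goal * abs(value)
    return absolute_ok or relative_ok


# The README example (SMI = -0.673 +- 0.035) meets the absolute criterion.
print(precision_reached(-0.673, 0.035))  # True

# A large estimate can stop on the relative criterion alone: 0.4 / 5.0 = 0.08.
print(precision_reached(5.0, 0.4))  # True

# Neither criterion is met here, so sampling would continue.
print(precision_reached(1.0, 0.5))  # False
```

Combining both criteria means estimates with a large magnitude do not have to be resolved to an absolute error of 0.1 before the sampling stops.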

Citing FastAMI

If you use fastami in your research work, please cite the corresponding paper (will probably be published by March 2023):

Klede et al. (2023). FastAMI - A Monte Carlo Approach to the Adjustment for Chance in Clustering Comparison Metrics. Proceedings of the AAAI Conference on Artificial Intelligence.
