A Monte Carlo approximation to the adjusted and standardized mutual information for faster clustering comparisons
FastAMI
A Monte Carlo approximation to the adjusted and standardized mutual information for faster clustering comparisons. You can use this package as a drop-in replacement for sklearn.metrics.adjusted_mutual_info_score when the exact calculation is too slow, for example because of large datasets and large numbers of clusters.
Installation
fastami requires Python >=3.8. You can install fastami via pip from PyPI:
pip install fastami
Usage Examples
FastAMI
You can use FastAMI as you would use adjusted_mutual_info_score from scikit-learn:
from fastami import adjusted_mutual_info_mc
labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]
ami, ami_error = adjusted_mutual_info_mc(labels_true, labels_pred)
# Output: AMI = -0.255 +- 0.008
print(f"AMI = {ami:.3f} +- {ami_error:.3f}")
Note that the output may vary slightly between runs due to the nature of the Monte Carlo approach. To ensure reproducible results, use the seed argument. By default, the algorithm terminates when it reaches an accuracy of 0.01. You can adjust this behavior using the accuracy_goal argument.
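To make the roles of the seed and the accuracy goal more concrete, here is a minimal, standard-library-only sketch of a Monte Carlo AMI estimate. It is not the algorithm used by fastami: it naively shuffles one labeling to sample the mutual information under the permutation model, which is far slower than the package. The function name ami_permutation_mc and the min_samples and max_samples parameters are hypothetical; only seed and accuracy_goal mirror argument names mentioned above. The sketch assumes both labelings contain more than one cluster.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def mutual_info(a, b):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        c / n * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def ami_permutation_mc(labels_true, labels_pred, accuracy_goal=0.01,
                       seed=None, min_samples=30, max_samples=100_000):
    """Naive Monte Carlo AMI sketch: returns (estimate, estimated error)."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    mi = mutual_info(labels_true, labels_pred)
    # Arithmetic mean of the entropies, scikit-learn's default normalizer.
    norm = 0.5 * (entropy(labels_true) + entropy(labels_pred))
    shuffled = list(labels_pred)
    n_samples, mean, m2 = 0, 0.0, 0.0  # Welford running mean and spread
    while True:
        rng.shuffle(shuffled)  # one sample from the permutation model
        x = mutual_info(labels_true, shuffled)
        n_samples += 1
        delta = x - mean
        mean += delta / n_samples
        m2 += delta * (x - mean)
        if n_samples >= min_samples:
            # Standard error of the estimated expected mutual information.
            emi_error = math.sqrt(m2 / (n_samples - 1) / n_samples)
            denom = norm - mean
            # First-order propagation of that error into the AMI.
            ami_error = abs(mi - norm) / denom ** 2 * emi_error
            if ami_error <= accuracy_goal or n_samples >= max_samples:
                break
    return (mi - mean) / denom, ami_error


labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]
ami, ami_error = ami_permutation_mc(labels_true, labels_pred, seed=42)
print(f"AMI = {ami:.3f} +- {ami_error:.3f}")
```

Calling the sketch twice with the same seed yields identical results, while a smaller accuracy_goal simply draws more samples before stopping.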
FastSMI
FastSMI works similarly:
from fastami import standardized_mutual_info_mc
labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]
smi, smi_error = standardized_mutual_info_mc(labels_true, labels_pred)
# Output: SMI = -0.673 +- 0.035
print(f"SMI = {smi:.3f} +- {smi_error:.3f}")
While FastSMI is usually faster than an exact calculation of the SMI, it is still orders of magnitude slower than FastAMI. Since the SMI is not confined to the interval [-1,1] like the AMI, FastSMI by default terminates once either the absolute or the relative error reaches 0.1, whichever happens first. You can adjust this behavior using the precision_goal argument.
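The combined stopping rule can be read as the following check. This is a sketch of one plausible interpretation, not code taken from fastami; the helper name precision_reached is hypothetical, and the package's internal comparison may differ in detail.

```python
def precision_reached(value, error, precision_goal=0.1):
    """Return True once the absolute OR the relative error meets the goal.

    Hypothetical helper: illustrates the 'whichever is reached first' rule.
    """
    absolute_ok = error <= precision_goal
    # The relative error is undefined for a value of exactly zero.
    relative_ok = value != 0 and error / abs(value) <= precision_goal
    return absolute_ok or relative_ok


# Small SMI: the absolute criterion is met.
print(precision_reached(-0.673, 0.035))
# Large SMI: the absolute error is big, but the relative error is only 4 %.
print(precision_reached(50.0, 2.0))
# Neither criterion is met yet, so sampling would continue.
print(precision_reached(5.0, 1.0))
```

The relative criterion matters because the SMI is unbounded: for a large SMI, an absolute error of 0.1 could require far more samples than a comparison usually needs.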
Citing FastAMI
If you use fastami in your research work, please cite the corresponding paper (will probably be published by March 2023):
Klede et al. (2023). FastAMI - A Monte Carlo Approach to the Adjustment for Chance in Clustering Comparison Metrics. Proceedings of the AAAI Conference on Artificial Intelligence.