Exact and approximate silhouette scoring with micro, macro, and weighted cluster averages.
sil_score
sil-score is a small Python package for exact and fast approximate silhouette scoring.
It extends the usual silhouette workflow with:
- per-sample silhouette scores
- micro-averaged silhouette score
- macro-averaged silhouette score
- cluster-weighted macro silhouette score
- exact vs approximate comparison report
The exact mode uses scikit-learn's silhouette_samples.
The approximate mode uses Euclidean distances to cluster centroids, making it faster but not identical to the classical silhouette definition.
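To make the idea concrete, here is a minimal NumPy sketch of one common centroid-based formulation (sometimes called the simplified silhouette). It is an assumption about how such an approximation can work, not the package's actual code: `a` is the distance from a sample to its own centroid, `b` is the distance to the nearest other centroid, and the score is `(b - a) / max(a, b)`. The function name `centroid_silhouette` is hypothetical.

```python
import numpy as np

def centroid_silhouette(X, labels):
    """Centroid-based silhouette sketch (illustrative, not the package's code)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes, idx = np.unique(labels, return_inverse=True)
    # One centroid per cluster, ordered like `classes`.
    centers = np.vstack([X[idx == k].mean(axis=0) for k in range(len(classes))])
    # Euclidean distance from every sample to every centroid: shape (n, k).
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    rows = np.arange(len(X))
    a = dists[rows, idx]            # distance to the sample's own centroid
    other = dists.copy()
    other[rows, idx] = np.inf       # mask out the own cluster
    b = other.min(axis=1)           # distance to the nearest other centroid
    denom = np.maximum(a, b)
    # Guard against a zero denominator (sample sitting on both centroids).
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, (b - a) / safe, 0.0)

X = np.array([[0.0], [2.0], [10.0], [12.0]])
labels = np.array([0, 0, 1, 1])
approx = centroid_silhouette(X, labels)
print(approx)
```

On this toy data the sketch yields 10/11 and 8/9 for the outer and inner points, which is close to but not equal to the exact values, consistent with the statement that the approximation is not identical to the classical definition. The sketch assumes at least two clusters.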
Installation
Install from PyPI:
pip install sil-score
Quick example
import numpy as np
from sil_score import (
    sil_samples,
    micro_sil_score,
    macro_sil_score,
    weighted_macro_sil_score,
    sil_approximation_report,
)
X = np.array([
    [0.0],
    [2.0],
    [10.0],
    [12.0],
])
labels = np.array([0, 0, 1, 1])
samples = sil_samples(X, labels)
micro = micro_sil_score(X, labels)
macro = macro_sil_score(X, labels)
print(samples)
print(micro)
print(macro)
Output:
[0.81818182 0.77777778 0.77777778 0.81818182]
0.797979797979798
0.797979797979798
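The numbers above can be reproduced by hand from the classical definition, where `a` is the mean distance to the other members of the sample's own cluster and `b` is the smallest mean distance to any other cluster. The following is a small NumPy sketch written for this toy example only; it does not use the package and does not handle single-sample clusters.

```python
import numpy as np

X = np.array([[0.0], [2.0], [10.0], [12.0]])
labels = np.array([0, 0, 1, 1])

# Pairwise Euclidean distances, shape (n, n).
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

scores = np.empty(len(X))
for i in range(len(X)):
    same = labels == labels[i]
    same[i] = False                  # exclude the sample itself
    a = D[i, same].mean()            # mean distance within the own cluster
    # Mean distance to each other cluster; keep the smallest.
    b = min(D[i, labels == k].mean() for k in np.unique(labels) if k != labels[i])
    scores[i] = (b - a) / max(a, b)

print(scores)         # per-sample values: 9/11, 7/9, 7/9, 9/11
print(scores.mean())  # micro average
```

For the first sample, `a = 2` and `b = (10 + 12) / 2 = 11`, so the score is `9 / 11 ≈ 0.818`. Both clusters have the same size and the same mean score here, which is why the micro and macro averages coincide.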
Functions
sil_samples
sil_samples(X, labels, approximation=False, centers=None)
Computes the silhouette score for each sample.
By default, it computes the exact silhouette values using scikit-learn.
scores = sil_samples(X, labels)
For a faster centroid-based approximation:
scores = sil_samples(X, labels, approximation=True)
You can also pass precomputed cluster centers:
scores = sil_samples(
    X,
    labels,
    approximation=True,
    centers=centers,
)
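If no centers are available from a clustering model, they can be derived from the data. The sketch below builds one centroid per cluster with NumPy. It assumes that row `i` of `centers` corresponds to the `i`-th sorted unique label; the documentation above does not state the expected ordering, so check it against the package before relying on this.

```python
import numpy as np

X = np.array([[0.0], [2.0], [10.0], [12.0]])
labels = np.array([0, 0, 1, 1])

# Assumption: centers are ordered like the sorted unique labels.
unique_labels = np.unique(labels)
centers = np.vstack([X[labels == k].mean(axis=0) for k in unique_labels])
print(centers)  # one row per cluster, one column per feature

# The result could then be passed as shown above:
# scores = sil_samples(X, labels, approximation=True, centers=centers)
```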
micro_sil_score
micro_sil_score(X, labels, approximation=False, centers=None)
Computes the mean of all sample-level silhouette scores. This is the usual average silhouette score. Larger clusters naturally have more influence because they contain more samples.
# Standard usage
score = micro_sil_score(X, labels)
# Approximate version
score = micro_sil_score(X, labels, approximation=True)
macro_sil_score
macro_sil_score(X, labels, approximation=False, centers=None)
Computes the mean silhouette score inside each cluster, then averages the cluster means equally. This gives every cluster the same importance, regardless of its size.
# Standard usage
score = macro_sil_score(X, labels)
# Approximate version
score = macro_sil_score(X, labels, approximation=True)
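The difference between micro and macro averaging only shows up when cluster sizes differ. The sketch below uses hypothetical per-sample scores (not computed from real data) for one large, well-separated cluster and one small, poorly separated cluster. The helper names `micro_average` and `macro_average` are illustrative and are not part of the package.

```python
import numpy as np

def micro_average(scores):
    # Mean over all samples: large clusters dominate.
    return float(np.mean(scores))

def macro_average(scores, labels):
    # Mean per cluster first, then an unweighted mean of those means.
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    cluster_means = [scores[labels == k].mean() for k in np.unique(labels)]
    return float(np.mean(cluster_means))

# Hypothetical scores: 8 samples in cluster 0, 2 samples in cluster 1.
scores = np.array([0.9] * 8 + [0.1] * 2)
labels = np.array([0] * 8 + [1] * 2)

micro = micro_average(scores)          # (8 * 0.9 + 2 * 0.1) / 10
macro = macro_average(scores, labels)  # (0.9 + 0.1) / 2
print(micro, macro)
```

Here the micro average is about 0.74 while the macro average is about 0.5, so the macro score exposes the weak small cluster that the micro score largely hides.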
weighted_macro_sil_score
weighted_macro_sil_score(X, labels, cluster_weights, approximation=False, centers=None)
Computes a cluster-weighted macro silhouette score. First, it computes the mean silhouette score for each cluster, then combines those cluster means using custom cluster weights.
Using a dictionary:
weights = {
    0: 0.2,
    1: 0.3,
    2: 0.5,
}
score = weighted_macro_sil_score(X, labels, cluster_weights=weights)
Using an array:
weights = [0.2, 0.3, 0.5]
score = weighted_macro_sil_score(X, labels, cluster_weights=weights)
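As an illustration of how the cluster means and weights combine, here is a minimal sketch that works on hypothetical per-sample scores. It makes two assumptions that the documentation above does not confirm: array weights follow the order of the sorted unique labels, and the weights are normalized by their sum. The function name `weighted_macro` is hypothetical.

```python
import numpy as np

def weighted_macro(scores, labels, cluster_weights):
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Mean score inside each cluster, in sorted label order.
    means = np.array([scores[labels == k].mean() for k in classes])
    if isinstance(cluster_weights, dict):
        w = np.array([cluster_weights[k] for k in classes.tolist()], dtype=float)
    else:
        # Assumption: sequence weights are ordered like the sorted labels.
        w = np.asarray(cluster_weights, dtype=float)
    # Assumption: weights are normalized so they need not sum to 1.
    return float(np.sum(w * means) / np.sum(w))

# Hypothetical per-sample scores for three clusters of two samples each.
scores = np.array([0.8, 0.6, 0.4, 0.2, 0.5, 0.7])
labels = np.array([0, 0, 1, 1, 2, 2])

from_dict = weighted_macro(scores, labels, {0: 0.2, 1: 0.3, 2: 0.5})
from_list = weighted_macro(scores, labels, [0.2, 0.3, 0.5])
print(from_dict, from_list)
```

The cluster means are 0.7, 0.3, and 0.6, so both calls give `0.2 * 0.7 + 0.3 * 0.3 + 0.5 * 0.6 = 0.53`.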
sil_approximation_report
sil_approximation_report(X, labels, centers=None, return_samples=False)
Compares exact silhouette scores with centroid-based approximate scores. It returns the Pearson correlation between the two sets of scores, along with error metrics:
report = sil_approximation_report(X, labels)
print(report)
Example output:
{
    "correlation": 0.96,
    "mean_absolute_error": 0.03,
    "mean_squared_error": 0.002,
    "root_mean_squared_error": 0.045,
    "max_absolute_error": 0.12,
    "mean_error": 0.01,
    "mean_exact_score": 0.52,
    "mean_approximate_score": 0.53,
    "n_samples": 300,
}
Use return_samples=True to also include the exact scores, approximate scores, and per-sample errors.
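For readers who want to see what these metrics mean, the sketch below computes the same set of keys from two score arrays with NumPy. It is an illustration, not the package's implementation. The sign convention `approximate - exact` for the error is an assumption, chosen because it agrees with the example output above (0.53 - 0.52 = 0.01). The function name `compare_scores` and the input values are hypothetical.

```python
import numpy as np

def compare_scores(exact, approx):
    exact = np.asarray(exact, dtype=float)
    approx = np.asarray(approx, dtype=float)
    err = approx - exact  # assumed sign convention
    mse = float(np.mean(err ** 2))
    return {
        "correlation": float(np.corrcoef(exact, approx)[0, 1]),  # Pearson r
        "mean_absolute_error": float(np.mean(np.abs(err))),
        "mean_squared_error": mse,
        "root_mean_squared_error": float(np.sqrt(mse)),
        "max_absolute_error": float(np.max(np.abs(err))),
        "mean_error": float(np.mean(err)),
        "mean_exact_score": float(np.mean(exact)),
        "mean_approximate_score": float(np.mean(approx)),
        "n_samples": int(exact.size),
    }

# Hypothetical exact and approximate per-sample scores.
exact = np.array([0.2, 0.4, 0.6, 0.8])
approx = np.array([0.3, 0.4, 0.7, 0.8])
result = compare_scores(exact, approx)
print(result)
```

A high correlation with a small mean absolute error suggests the approximation preserves both the ranking and the magnitude of the exact scores. A nonzero mean error indicates a systematic bias in one direction.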
Exact vs Approximate mode
- Exact mode: sil_samples(X, labels, approximation=False). Uses the classical silhouette definition based on distances between samples.
- Approximate mode: sil_samples(X, labels, approximation=True). Uses distances from each sample to cluster centroids. Because it avoids comparing every sample with every other sample, a cost that grows quadratically with the number of samples, it can be significantly faster on larger datasets.
Requirements
sil-score depends on:
- NumPy
- scikit-learn
License
This project is licensed under the MIT License.