SLEDgeHammer: Support, Length, Exclusivity and Difference Weighted for Group Evaluation
SLEDgeHammer (SLEDgeH): Support, Length, Exclusivity and Difference Weighted for Group Evaluation
SLEDgeH is a Python library for evaluating clustering results using a semantic-based approach. Unlike traditional distance-based metrics, this method leverages the semantic relationship between significant frequent patterns identified among cluster items. This internal validation technique is particularly effective for data organized in categorical form.
🔥 Features
- Semantic Descriptors: Analyze feature support in clusters.
- Particularization of Descriptors: Refine cluster descriptors using customizable thresholds.
- SLED Indicators: Evaluate clusters based on Support (S), Length deviation (L), Exclusivity (E), and Descriptor support Difference (D).
- Customizable Aggregation: Choose from harmonic, geometric, or median aggregation for SLED indicators.
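The three aggregation options correspond to standard summary statistics. As a rough illustration only, the sketch below applies Python's `statistics` module to four hypothetical indicator values. It ignores the `W` weighting factors and is not the library's own implementation; the names `indicators` and `aggregated` are made up for this example.

```python
from statistics import geometric_mean, harmonic_mean, median

# Hypothetical S, L, E, D indicator values for one cluster (illustration only).
indicators = [0.8, 0.4, 0.2, 0.4]

# Unweighted versions of the three aggregation choices.
aggregated = {
    "harmonic": harmonic_mean(indicators),    # dominated by the lowest indicator
    "geometric": geometric_mean(indicators),  # multiplicative average
    "median": median(indicators),             # robust to one extreme indicator
}

for name, value in aggregated.items():
    print(f"{name}: {value:.4f}")
```

The harmonic mean is the strictest of the three because a single weak indicator pulls it down, while the median ignores an isolated outlier.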
🛠 Installation
Install using pip:
pip install sledgehammer
🚀 Usage
Importing the Library
import numpy as np
from sklearn.cluster import KMeans # requires: pip install scikit-learn
from sledgehammer import sledgehammer_score, sledgehammer_score_clusters, semantic_descriptors
Example Workflow
# Generate a random binary dataset
X = np.random.randint(0, 2, (100, 5))
# Specify the number of clusters
num_clusters = 3
# Perform K-Means clustering
kmeans = KMeans(n_clusters=num_clusters, random_state=42)
labels = kmeans.fit_predict(X)
# Calculate the SLEDgeH score
average_score = sledgehammer_score(X, labels, aggregation='median')
print(f"Average SLEDgeH Score: {average_score}\n")
# Generate semantic descriptors
report = semantic_descriptors(X, labels, particular_threshold=0.5, report_form=True)
# Print cluster descriptors
for i in range(num_clusters):
    print(f"Cluster {i}:\n{report[i]}\n")
📜 Functions Overview
sledgehammer_score
Computes the average SLEDgeH score for all clusters.
Parameters:
- X: Binary feature matrix of shape (n_samples, n_features).
- labels: Cluster labels for each sample.
- W: Weighting factors for the SLED indicators (default [0.3, 0.1, 0.5, 0.1]).
- particular_threshold: Threshold for descriptor particularization (None for no particularization).
- aggregation: Aggregation method ('harmonic', 'geometric', or 'median').
Returns:
score: Average SLEDgeH score.
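Because `X` must be a binary matrix, categorical data has to be encoded before scoring. The following is a minimal sketch, assuming one-hot encoding suits the data. `one_hot_binary` is a hypothetical helper written for this example and is not part of sledgehammer.

```python
def one_hot_binary(records):
    """Expand categorical rows into 0/1 columns, one per (attribute, value) pair."""
    n_attributes = len(records[0])
    # Collect every distinct value of every attribute, in a stable sorted order.
    names = []
    for j in range(n_attributes):
        for value in sorted({row[j] for row in records}):
            names.append((j, value))
    # A cell is 1 when the row holds that value for that attribute, else 0.
    matrix = [[1 if row[j] == value else 0 for (j, value) in names]
              for row in records]
    return matrix, names


records = [["red", "small"], ["blue", "small"], ["red", "large"]]
X_binary, feature_names = one_hot_binary(records)
print(feature_names)
print(X_binary)
```

The resulting list of lists can be wrapped with `np.asarray(X_binary)` before it is passed as `X`.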
sledgehammer_score_clusters
Computes the SLEDgeH score for individual clusters.
Parameters:
- Same as sledgehammer_score, with the addition of:
  - aggregation=None: If None, returns scores for each SLED indicator separately.
Returns:
- scores: Aggregated SLEDgeH scores for each cluster.
- score_matrix: Individual SLED indicator scores if aggregation=None.
semantic_descriptors
Computes semantic descriptors based on feature support in clusters.
Parameters:
- X: Binary feature matrix of shape (n_samples, n_features).
- labels: Cluster labels for each sample.
- particular_threshold: Threshold for descriptor particularization.
- report_form: If True, returns descriptors as a sorted dictionary for each cluster.
Returns:
- descriptors: Matrix with particularized feature support in clusters.
- report: Sorted dictionary of significant features in each cluster (if report_form=True).
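To make "feature support in clusters" concrete, here is a small NumPy sketch that computes, for each cluster, the fraction of its samples in which each binary feature equals 1. It assumes this plain definition of support and skips particularization, so the library's descriptors may differ. `cluster_feature_support` is a hypothetical name used only for this example.

```python
import numpy as np


def cluster_feature_support(X, labels):
    """Return cluster ids and a (n_clusters, n_features) matrix of feature support."""
    X = np.asarray(X)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    # Mean of a 0/1 column within a cluster = share of its samples having the feature.
    support = np.vstack([X[labels == c].mean(axis=0) for c in cluster_ids])
    return cluster_ids, support


X_demo = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 1, 1],
                   [0, 1, 1]])
labels_demo = np.array([0, 0, 1, 1])

cluster_ids, support = cluster_feature_support(X_demo, labels_demo)
print(support)
```

In this toy data, feature 0 appears in every sample of cluster 0 and in none of cluster 1, which is the kind of contrast a descriptor is meant to surface.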
particularize_descriptors
Particularizes descriptors based on support thresholds.
Parameters:
- descriptors: Feature support matrix of shape (n_clusters, n_features).
- particular_threshold: Threshold for particularization (default 1.0).
Returns:
descriptors: Matrix with particularized support values.
📄 License
This project is licensed under the MIT License. See the LICENSE file for details.
🤝 Contributing
We welcome contributions to SLEDgeH! To contribute:
- Fork this repository.
- Create a new branch for your feature.
- Submit a pull request with your changes.
For questions or information, feel free to reach out at: aquinordga@gmail.com.
👨💻 Author
💬 Feedback
Feel free to open an issue or contact me for feedback or feature requests. Your input is highly appreciated!