SLEDgeHammer: Support, Length, Exclusivity and Difference Weighted for Group Evaluation

SLEDgeHammer (SLEDgeH): Support, Length, Exclusivity and Difference Weighted for Group Evaluation

SLEDgeH is a Python library for evaluating clustering results using a semantic-based approach. Unlike traditional distance-based metrics, this method leverages the semantic relationship between significant frequent patterns identified among cluster items. This internal validation technique is particularly effective for data organized in categorical form.


🔥 Features

  • Semantic Descriptors: Analyze feature support in clusters.
  • Particularization of Descriptors: Refine cluster descriptors using customizable thresholds.
  • SLED Indicators: Evaluate clusters based on Support (S), Length deviation (L), Exclusivity (E), and Descriptor support Difference (D).
  • Customizable Aggregation: Choose from harmonic, geometric, or median aggregation for SLED indicators.

🛠 Installation

Install using pip:

pip install sledgehammer

🚀 Usage

Importing the Library

import numpy as np
from sklearn.cluster import KMeans  # requires: pip install scikit-learn
from sledgehammer import sledgehammer_score, sledgehammer_score_clusters, semantic_descriptors

Example Workflow

# Generate a random binary dataset (seeded so the run is reproducible)
rng = np.random.default_rng(42)
X = rng.integers(0, 2, size=(100, 5))

# Specify the number of clusters
num_clusters = 3

# Perform K-Means clustering
kmeans = KMeans(n_clusters=num_clusters, random_state=42)
labels = kmeans.fit_predict(X)

# Calculate the SLEDgeH score
average_score = sledgehammer_score(X, labels, aggregation='median')
print(f"Average SLEDgeH Score: {average_score}\n")

# Generate semantic descriptors
report = semantic_descriptors(X, labels, particular_threshold=0.5, report_form=True)

# Print cluster descriptors
for i in range(num_clusters):
    print(f"Cluster {i}:\n{report[i]}\n")

📜 Functions Overview

sledgehammer_score

Computes the average SLEDgeH score for all clusters.

Parameters:

  • X: Binary feature matrix of shape (n_samples, n_features).
  • labels: Cluster labels for each sample.
  • W: Weighting factors for the SLED indicators (default [0.3, 0.1, 0.5, 0.1]).
  • particular_threshold: Threshold for descriptor particularization (None for no particularization).
  • aggregation: Aggregation method ('harmonic', 'geometric', or 'median').

Returns:

  • score: Average SLEDgeH score.

sledgehammer_score_clusters

Computes the SLEDgeH score for individual clusters.

Parameters:

  • Same as sledgehammer_score, except:
    • aggregation: May also be None, in which case the score of each SLED indicator is returned separately.

Returns:

  • scores: Aggregated SLEDgeH scores for each cluster.
  • score_matrix: Individual SLED indicator scores if aggregation=None.
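
The relationship between the per-indicator matrix, the per-cluster scores, and the overall average can be illustrated without the library. The following is a minimal sketch using only the Python standard library. The indicator values are made up, the helper names (aggregate_clusters, average_score) are hypothetical, and the weighting factors W are deliberately left out, so this is not the package's actual computation.

```python
from statistics import geometric_mean, harmonic_mean, mean, median

# Hypothetical indicator values per cluster, in the order (S, L, E, D).
# These numbers are invented for illustration; they are not library output.
score_matrix = [
    [0.8, 0.5, 0.4, 0.5],
    [0.9, 0.9, 0.9, 0.9],
    [0.2, 0.4, 0.6, 0.8],
]

# The three aggregation names listed in the parameter description.
AGGREGATORS = {
    "harmonic": harmonic_mean,
    "geometric": geometric_mean,  # requires strictly positive values
    "median": median,
}


def aggregate_clusters(matrix, aggregation="harmonic"):
    """Collapse each row of four indicator values into one score per cluster."""
    aggregate = AGGREGATORS[aggregation]
    return [aggregate(row) for row in matrix]


def average_score(matrix, aggregation="harmonic"):
    """Average the per-cluster scores into a single value."""
    return mean(aggregate_clusters(matrix, aggregation))


for name in AGGREGATORS:
    per_cluster = aggregate_clusters(score_matrix, name)
    print(name, [round(s, 3) for s in per_cluster],
          round(average_score(score_matrix, name), 3))
```

The harmonic mean is pulled down most strongly by a single low indicator, the geometric mean less so, and the median ignores the extremes, which is one reason to compare aggregations on the same clustering.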

semantic_descriptors

Computes semantic descriptors based on feature support in clusters.

Parameters:

  • X: Binary feature matrix of shape (n_samples, n_features).
  • labels: Cluster labels for each sample.
  • particular_threshold: Threshold for descriptor particularization.
  • report_form: If True, returns descriptors as a sorted dictionary for each cluster.

Returns:

  • descriptors: Matrix with particularized feature support in clusters.
  • report: Sorted dictionary of significant features in each cluster (if report_form=True).
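
To make "feature support in clusters" concrete, here is a small standard-library sketch. It assumes that the support of a feature in a cluster is the fraction of that cluster's samples in which the feature equals 1. The function names and the min_support parameter are hypothetical and do not reproduce the package's implementation or its report format.

```python
from collections import defaultdict


def feature_support(X, labels):
    """Return {cluster: [support per feature]} for a binary matrix X."""
    rows_by_cluster = defaultdict(list)
    for row, label in zip(X, labels):
        rows_by_cluster[label].append(row)

    support = {}
    for label, rows in sorted(rows_by_cluster.items()):
        n = len(rows)
        # Column-wise fraction of samples in which the feature is present.
        support[label] = [sum(col) / n for col in zip(*rows)]
    return support


def support_report(support, min_support=0.5):
    """Per cluster, keep features with support >= min_support, highest first."""
    out = {}
    for label, values in support.items():
        kept = {f: s for f, s in enumerate(values) if s >= min_support}
        out[label] = dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True))
    return out


X = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]
labels = [0, 0, 1, 1]

support = feature_support(X, labels)
print(support)                  # {0: [1.0, 0.5, 0.0], 1: [0.0, 1.0, 1.0]}
print(support_report(support))  # {0: {0: 1.0, 1: 0.5}, 1: {1: 1.0, 2: 1.0}}
```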

particularize_descriptors

Particularizes descriptors based on support thresholds.

Parameters:

  • descriptors: Feature support matrix of shape (n_clusters, n_features).
  • particular_threshold: Threshold for particularization (default 1.0).

Returns:

  • descriptors: Matrix with particularized support values.
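
The description above does not spell out the particularization rule. One plausible reading, shown here purely as an assumption, is that a feature's support is kept in a cluster only when it reaches particular_threshold times that feature's maximum support across all clusters, and is set to zero otherwise. The function below is a hypothetical stand-in written with plain lists, not the package's code.

```python
def particularize(descriptors, threshold=1.0):
    """Zero out a feature's support in clusters where it falls below
    threshold * (that feature's maximum support across clusters).

    Assumed rule for illustration only; the library may differ.
    """
    n_features = len(descriptors[0])
    col_max = [max(row[j] for row in descriptors) for j in range(n_features)]
    return [
        [s if s >= threshold * col_max[j] else 0.0 for j, s in enumerate(row)]
        for row in descriptors
    ]


# Rows are clusters, columns are features (hypothetical support values).
descriptors = [
    [0.9, 0.5, 0.1],
    [0.6, 0.5, 0.8],
]

print(particularize(descriptors, threshold=1.0))  # [[0.9, 0.5, 0.0], [0.0, 0.5, 0.8]]
print(particularize(descriptors, threshold=0.5))  # [[0.9, 0.5, 0.0], [0.6, 0.5, 0.8]]
```

Under this reading, a threshold of 1.0 leaves each feature only in the cluster or clusters where its support is highest, while lower thresholds let a feature describe several clusters at once.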

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


🤝 Contributing

We welcome contributions to SLEDgeH! To contribute:

  1. Fork this repository.
  2. Create a new branch for your feature.
  3. Submit a pull request with your changes.

For questions or information, feel free to reach out at: aquinordga@gmail.com.


👨‍💻 Author

Developed by AQUINO, R. D. G.


💬 Feedback

Feel free to open an issue or contact me for feedback or feature requests. Your input is highly appreciated!
