Mutual-information feature screening measurement for Swarmauri classification datasets using scikit-learn.

Swarmauri Measurement Mutual Information

swarmauri_measurement_mutualinformation is the Swarmauri feature-signal measurement for supervised classification datasets. It wraps sklearn.feature_selection.mutual_info_classif and returns the average mutual information across all non-target columns in a Pandas DataFrame.
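The computation described above can be approximated with scikit-learn alone. The following is a minimal sketch, not the package's actual source: the helper name mean_mutual_information and its random_state argument are assumptions made for illustration.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif


def mean_mutual_information(
    frame: pd.DataFrame, target_column: str, random_state: int = 0
) -> float:
    """Hypothetical helper: average MI between each non-target column and the target."""
    # Every column except the target is treated as a feature.
    X = frame.drop(columns=[target_column])
    y = frame[target_column]
    # One non-negative score per feature, in nats.
    scores = mutual_info_classif(X, y, random_state=random_state)
    return float(scores.mean())


frame = pd.DataFrame(
    {
        "feature_a": [0, 1, 1, 0, 1, 0],
        "feature_b": [5.1, 5.0, 4.9, 5.2, 5.1, 5.0],
        "target": [0, 1, 1, 0, 1, 0],
    }
)

result = mean_mutual_information(frame, "target")
print(result)
```

Because the result is an average, one strong feature and several weak ones can produce the same value as a set of uniformly moderate features.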

Why Use Swarmauri Measurement Mutual Information

  • Estimate how strongly the features in a dataset depend, on average, on a discrete target before model training.
  • Flag low-signal feature sets earlier in a Swarmauri data or evaluation pipeline.
  • Reuse a standard MeasurementBase component instead of hand-wiring feature scoring logic.
  • Pair feature screening with downstream metrics, vectorization, or model selection flows.

FAQ

What input does this measurement expect?
A Pandas DataFrame plus the name of the target column.

What does it return?
A single float: the mean of the per-feature mutual-information scores produced by mutual_info_classif.

Does it return one score per feature?
No. This component averages the feature scores. If you need per-feature values, call mutual_info_classif directly.

What units are the underlying scores in?
Nats. Scikit-learn documents the output of mutual_info_classif in nat units, meaning the estimate uses the natural logarithm.
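If you prefer to report scores in bits, divide a value in nats by ln 2. The snippet below is a small illustration; the function name nats_to_bits is hypothetical and not part of this package or scikit-learn.

```python
import math


def nats_to_bits(mi_nats: float) -> float:
    """Convert a mutual-information value from nats to bits."""
    # 1 bit equals ln(2) nats, so dividing by ln(2) changes the log base to 2.
    return mi_nats / math.log(2)


# A score of ln(2) nats corresponds to exactly one bit.
one_bit = nats_to_bits(math.log(2))
print(one_bit)
```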

Features

  • Supervised mutual-information scoring for discrete targets.
  • Automatic exclusion of the target column from the feature matrix.
  • Returns one aggregate score that is easy to log, compare, or threshold.
  • Uses the Swarmauri measurement interface for pipeline compatibility.
  • Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.
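As an illustration of thresholding an aggregate score, the sketch below compares two candidate feature groups against a cutoff. It calls scikit-learn directly rather than this package; the helper mean_mi, the cutoff MIN_MEAN_MI, and the group names are all assumptions chosen for the example.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

MIN_MEAN_MI = 0.05  # hypothetical cutoff in nats; tune it for your own data


def mean_mi(frame: pd.DataFrame, target_column: str) -> float:
    """Hypothetical stand-in for the aggregate score: mean MI over non-target columns."""
    X = frame.drop(columns=[target_column])
    scores = mutual_info_classif(X, frame[target_column], random_state=0)
    return float(scores.mean())


data = pd.DataFrame(
    {
        "clicked_email": [1, 0, 1, 1, 0, 0],
        "days_active": [4, 2, 5, 6, 1, 2],
        "plan_tier": [2, 1, 2, 3, 1, 1],
        "converted": [1, 0, 1, 1, 0, 0],
    }
)

# Candidate feature groups to screen (names are illustrative only).
candidates = {
    "engagement": ["clicked_email", "days_active"],
    "billing": ["plan_tier"],
}

report = {}
for name, columns in candidates.items():
    score = mean_mi(data[columns + ["converted"]], "converted")
    report[name] = {"mean_mi": score, "passes": score >= MIN_MEAN_MI}

print(report)
```

A single number per group is convenient for logging, but a group can pass the cutoff on the strength of one column, so it is worth checking per-feature scores before dropping anything.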

Installation

# With uv
uv add swarmauri_measurement_mutualinformation

# Or with pip
pip install swarmauri_measurement_mutualinformation

Usage

import pandas as pd
from swarmauri_measurement_mutualinformation import MutualInformationMeasurement

frame = pd.DataFrame(
    {
        "feature_a": [0, 1, 1, 0, 1, 0],
        "feature_b": [5.1, 5.0, 4.9, 5.2, 5.1, 5.0],
        "target": [0, 1, 1, 0, 1, 0],
    }
)

measurement = MutualInformationMeasurement()
score = measurement.calculate(frame, target_column="target")
print(score)

Examples

Screen a small classification dataset

import pandas as pd
from swarmauri_measurement_mutualinformation import MutualInformationMeasurement

data = pd.DataFrame(
    {
        "clicked_email": [1, 0, 1, 1, 0, 0],
        "days_active": [4, 2, 5, 6, 1, 2],
        "plan_tier": [2, 1, 2, 3, 1, 1],
        "converted": [1, 0, 1, 1, 0, 0],
    }
)

measurement = MutualInformationMeasurement()
print(measurement.calculate(data, target_column="converted"))

Inspect per-feature scores directly

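from sklearn.feature_selection import mutual_info_classif

# Reuses the `frame` DataFrame defined in the Usage section.
X = frame.drop(columns=["target"])
y = frame["target"]

# Fix the seed: the estimator adds small random noise to continuous features.
scores = mutual_info_classif(X, y, random_state=0)
for column, mi in zip(X.columns, scores):
    print(column, mi)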

Related Packages

Swarmauri Foundations

More Documentation

Best Practices

  • Encode categorical columns numerically before calling this measurement.
  • Remove or impute missing values before scoring.
  • Use a stable preprocessing pipeline when comparing mutual-information (MI) scores across experiments.
  • Inspect the raw per-feature scores if feature-level ranking matters more than a single aggregate summary.
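The practices above can be sketched as a small preprocessing step followed by per-feature scoring. This example calls scikit-learn directly and rests on several assumptions: the column names are invented, median imputation and integer category codes are just one reasonable choice, and whether this measurement exposes a discrete_features option is not stated here.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

raw = pd.DataFrame(
    {
        "plan": ["free", "pro", "pro", "free", "team", "free", "pro", "team"],
        "days_active": [1.0, 6.0, np.nan, 2.0, 7.0, 1.0, 5.0, 8.0],
        "converted": [0, 1, 1, 0, 1, 0, 1, 1],
    }
)

clean = raw.copy()
# Encode the categorical column as integer codes.
clean["plan"] = clean["plan"].astype("category").cat.codes
# Impute missing numeric values with the column median.
clean["days_active"] = clean["days_active"].fillna(clean["days_active"].median())

X = clean.drop(columns=["converted"])
y = clean["converted"]

# Mark which columns are discrete so integer codes are not treated as continuous.
discrete_mask = np.array([True, False])
scores = mutual_info_classif(
    X, y, discrete_features=discrete_mask, random_state=0
)

per_feature = dict(zip(X.columns, scores))
print(per_feature)
```

Keeping the encoding, the imputation rule, and the seed fixed makes scores from different runs easier to compare.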

License

This project is licensed under the Apache-2.0 License.
