Mutual-information feature screening measurement for Swarmauri classification datasets using scikit-learn.
Swarmauri Measurement Mutual Information
swarmauri_measurement_mutualinformation is the Swarmauri feature-signal
measurement for supervised classification datasets. It wraps
sklearn.feature_selection.mutual_info_classif and returns the average mutual
information across all non-target columns in a Pandas DataFrame.
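The computation described above can be sketched in a few lines. This is an illustrative stand-in, not the package's actual source: the helper name `mean_mutual_information`, the `random_state` default, and the demo data are assumptions made for the example.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif


def mean_mutual_information(frame, target_column, random_state=0):
    """Average the per-feature MI scores over all non-target columns."""
    # Everything except the target column is treated as a feature.
    features = frame.drop(columns=[target_column])
    target = frame[target_column]
    # mutual_info_classif returns one score (in nats) per feature column.
    # A fixed random_state keeps the estimate repeatable, because the
    # estimator adds a small amount of noise to continuous features.
    per_feature = mutual_info_classif(features, target, random_state=random_state)
    return float(per_feature.mean())


# Hypothetical demo data, made up for this sketch.
demo = pd.DataFrame(
    {
        "feature_a": [0, 1, 1, 0, 1, 0, 1, 0],
        "feature_b": [5.1, 5.0, 4.9, 5.2, 5.1, 5.0, 4.8, 5.3],
        "target": [0, 1, 1, 0, 1, 0, 1, 0],
    }
)
demo_score = mean_mutual_information(demo, "target")
print(demo_score)
```

Because the result is a plain mean, one strong feature and several weak ones can produce the same aggregate as a set of uniformly moderate features.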
Why Use Swarmauri Measurement Mutual Information
- Estimate how strongly each feature depends on a discrete target before model training.
- Reduce low-signal columns earlier in a Swarmauri data or evaluation pipeline.
- Reuse a standard MeasurementBase component instead of hand-wiring feature scoring logic.
- Pair feature screening with downstream metrics, vectorization, or model selection flows.
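As a rough illustration of screening out low-signal columns, the sketch below calls scikit-learn directly and keeps only features whose score clears a cutoff. The `MIN_NATS` value, the `coin_flip` column, and the data are hypothetical, and `discrete_features=True` is chosen here only because every feature in the toy frame is an integer code.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# Hypothetical data: clicked_email mirrors the target, coin_flip is unrelated to it.
data = pd.DataFrame(
    {
        "clicked_email": [1, 0, 1, 1, 0, 0, 1, 0],
        "coin_flip": [0, 0, 0, 1, 1, 0, 1, 1],
        "converted": [1, 0, 1, 1, 0, 0, 1, 0],
    }
)

MIN_NATS = 0.05  # assumed cutoff; tune it for your own data

X = data.drop(columns=["converted"])
y = data["converted"]

# Score every feature, then keep only those at or above the cutoff.
scores = pd.Series(mutual_info_classif(X, y, discrete_features=True), index=X.columns)
kept_columns = scores[scores >= MIN_NATS].index.tolist()
screened = data[kept_columns + ["converted"]]
print(kept_columns)
```

Screening on per-feature scores rather than the aggregate is a deliberate choice in this sketch: the averaged value cannot tell you which column to drop.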
FAQ
What input does this measurement expect?
A Pandas DataFrame plus the name of the target column.
What does it return?
A single float: the mean of the per-feature mutual-information scores produced by mutual_info_classif.
Does it return one score per feature?
No. This component averages the feature scores. If you need per-feature values, call mutual_info_classif directly.
What units are the underlying scores in?
Scikit-learn documents mutual_info_classif outputs in natural-log units (nats).
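If you prefer bits, divide a score in nats by ln 2. The helper below is a small sketch; the name `nats_to_bits` is made up for this example. For scale, the true mutual information between a feature and the target cannot exceed the target's entropy, which is ln 2 (about 0.693 nats, or 1 bit) for a perfectly balanced binary target. Estimated scores can deviate from that bound slightly.

```python
import math


def nats_to_bits(score_nats):
    """Convert a mutual-information score from nats to bits."""
    return score_nats / math.log(2)


# Entropy of a perfectly balanced binary target, in nats.
balanced_binary_entropy_nats = math.log(2)
print(nats_to_bits(balanced_binary_entropy_nats))
```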
Features
- Supervised mutual-information scoring for discrete targets.
- Automatic exclusion of the target column from the feature matrix.
- Returns one aggregate score that is easy to log, compare, or threshold.
- Uses the Swarmauri measurement interface for pipeline compatibility.
- Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.
Installation
uv add swarmauri_measurement_mutualinformation
pip install swarmauri_measurement_mutualinformation
Usage
import pandas as pd
from swarmauri_measurement_mutualinformation import MutualInformationMeasurement
frame = pd.DataFrame(
    {
        "feature_a": [0, 1, 1, 0, 1, 0],
        "feature_b": [5.1, 5.0, 4.9, 5.2, 5.1, 5.0],
        "target": [0, 1, 1, 0, 1, 0],
    }
)
measurement = MutualInformationMeasurement()
score = measurement.calculate(frame, target_column="target")
print(score)
Examples
Screen a small classification dataset
import pandas as pd
from swarmauri_measurement_mutualinformation import MutualInformationMeasurement
data = pd.DataFrame(
    {
        "clicked_email": [1, 0, 1, 1, 0, 0],
        "days_active": [4, 2, 5, 6, 1, 2],
        "plan_tier": [2, 1, 2, 3, 1, 1],
        "converted": [1, 0, 1, 1, 0, 0],
    }
)
measurement = MutualInformationMeasurement()
print(measurement.calculate(data, target_column="converted"))
Inspect per-feature scores directly
from sklearn.feature_selection import mutual_info_classif
# Reuses `frame` from the Usage section above.
X = frame.drop(columns=["target"])
y = frame["target"]
scores = mutual_info_classif(X, y)
# Print one score per feature column.
for column, score in zip(X.columns, scores):
    print(column, score)
Related Packages
- swarmauri_measurement_tokencountestimator
- swarmauri_metric_hamming
- swarmauri_measurement_mutualinformation
Swarmauri Foundations
More Documentation
Best Practices
- Encode categorical columns numerically before calling this measurement.
- Remove or impute missing values before scoring.
- Use a stable preprocessing pipeline when comparing MI across experiments.
- Inspect the raw per-feature scores if feature-level ranking matters more than a single aggregate summary.
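The practices above could be combined into a small preprocessing step along these lines. This is a sketch under stated assumptions, not part of the package: `prepare_features` is a hypothetical helper, median imputation and the `"missing"` label are arbitrary choices, and it assumes every numeric column should be treated as continuous. It calls scikit-learn directly so the per-feature scores stay visible, and it fixes `random_state` so repeated runs are comparable.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# Hypothetical raw data with a string column and missing values.
raw = pd.DataFrame(
    {
        "plan": ["free", "pro", "pro", "free", None, "pro", "free", "pro"],
        "days_active": [1.0, 6.0, 5.0, None, 2.0, 7.0, 1.0, 6.0],
        "converted": [0, 1, 1, 0, 0, 1, 0, 1],
    }
)


def prepare_features(frame, target_column):
    """Return a numeric, gap-free feature matrix plus a discrete-column mask."""
    features = frame.drop(columns=[target_column]).copy()
    discrete_mask = []
    for column in features.columns:
        if pd.api.types.is_numeric_dtype(features[column]):
            # Numeric column: fill gaps with the column median.
            features[column] = features[column].fillna(features[column].median())
            discrete_mask.append(False)
        else:
            # Categorical column: label gaps explicitly, then map to integer codes.
            filled = features[column].fillna("missing")
            features[column] = filled.astype("category").cat.codes
            discrete_mask.append(True)
    return features, discrete_mask


X, discrete_mask = prepare_features(raw, "converted")
scores = mutual_info_classif(
    X, raw["converted"], discrete_features=discrete_mask, random_state=0
)
for column, score in zip(X.columns, scores):
    print(column, round(float(score), 4))
```

Running the same function with the same `random_state` on every experiment is what keeps the resulting scores comparable from one run to the next.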
License
This project is licensed under the Apache-2.0 License.
File details
Details for the file swarmauri_measurement_mutualinformation-0.11.0.dev1.tar.gz.
File metadata
- Download URL: swarmauri_measurement_mutualinformation-0.11.0.dev1.tar.gz
- Upload date:
- Size: 8.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ca3febb5a0a15aaf2743fa841b98caa6827550c24032680fc5051bec045b15db |
| MD5 | 3116bd9828162431243b5c5937163d3c |
| BLAKE2b-256 | e29bd840f412851a01bf469bff3fa26054f33ec3269a4b00350b77269c8fc77e |
File details
Details for the file swarmauri_measurement_mutualinformation-0.11.0.dev1-py3-none-any.whl.
File metadata
- Download URL: swarmauri_measurement_mutualinformation-0.11.0.dev1-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d32e2c4445bd6b3d5512aed13d82585547a8d4e88b829ee2c7029100b33b949 |
| MD5 | 9484731c6c60a066c61785d6a2d2b29a |
| BLAKE2b-256 | e36b58640bcb832dcea2734cc24f8834dcafed9ea50ad9ebaf322484de2e2338 |