Universal AutoML + Feature Engineering + Explainability Library

🧠 featuremind v3.1.0

Universal AutoML · Feature Engineering · Reliability Framework


📌 What is featuremind?

featuremind is a one-line AutoML library that handles the complete machine learning pipeline, from raw CSV to production-ready model, with built-in reliability checking, leakage detection, and feature engineering.

import featuremind as fm
fm.analyze("data.csv")

That's it. One line. Full analysis, model selection, feature suggestions, SHAP importance, leakage check, and HTML report, all automated.


🧪 Tested Datasets

featuremind v3.1 has been verified on:

| Dataset | Type | Score | Notes |
| --- | --- | --- | --- |
| Telecom Churn (7,043 rows) | Classification | 85.7% F1 | ✅ Stable, well-balanced |
| Credit Card Fraud (284,807 rows) | Classification | ~99% F1 | ⚠️ High score due to PCA-transformed, separable data |
| Heart Failure Medical | Classification | ~80% accuracy | ✅ Works |
| House Prices | Regression | R² reported | ✅ Works |
| Generic CSVs | Auto-detected | Auto-detected | ✅ Works |

🚀 Key Features

🤖 Auto ML Pipeline

  • Loads and cleans any CSV automatically
  • Detects target column, task type (classification / regression), and data issues
  • Trains 6 models: LogisticRegression, RandomForest, GradientBoosting, XGBoost, LightGBM, CatBoost
  • Picks best model using cross-validation
  • Auto hyperparameter tuning (RandomizedSearchCV)
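
The task-type detection step can be illustrated with a small heuristic. This is only a sketch under stated assumptions: the function name `infer_task_type` and the `max_classes` cutoff are hypothetical and are not taken from featuremind's source.

```python
from typing import Sequence


def infer_task_type(target: Sequence[object], max_classes: int = 20) -> str:
    """Guess whether a target column calls for classification or regression.

    Hypothetical heuristic for illustration, not featuremind's actual rule.
    """
    values = [v for v in target if v is not None]
    if not values:
        raise ValueError("target column has no usable values")

    distinct = set(values)
    # bool is a subclass of int in Python, so exclude it from "numeric".
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )

    # Text or boolean labels are always treated as classes.
    if not all_numeric:
        return "classification"

    # Whole-number targets with few distinct values look like class codes.
    all_integral = all(float(v).is_integer() for v in values)
    if all_integral and len(distinct) <= max_classes:
        return "classification"

    return "regression"


print(infer_task_type(["Yes", "No", "Yes"]))
print(infer_task_type([208500.0, 181500.5, 223500.0]))
```

A rule like this misreads whole-number regression targets with few distinct values, so a real pipeline would likely let the user override the guess.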

🛡️ Leakage Guard (Core Feature)

  • Detects if any feature formula references the target column
  • Flags columns with suspiciously high correlation with target (>0.95)
  • Smart ID detection (non-generalizable columns)
  • Warns user before model training (no silent failures)
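
Two of the checks above, the correlation flag and ID detection, can be sketched in plain Python. The 0.95 threshold comes from the list above; the function names, the "unique per row and not a float" ID rule, and the sample columns are assumptions made for illustration, not featuremind's implementation.

```python
import math
from typing import Dict, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def flag_suspicious_columns(
    features: Dict[str, Sequence[object]],
    target: Sequence[float],
    threshold: float = 0.95,
) -> Dict[str, str]:
    """Return {column: reason} for columns that look leaky or ID-like."""
    flags: Dict[str, str] = {}
    n_rows = len(target)
    for name, values in features.items():
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric and abs(pearson(values, target)) > threshold:
            flags[name] = "high correlation with target"
        # Assumed ID rule: one distinct value per row, and not a float column.
        elif len(set(values)) == n_rows and not any(
            isinstance(v, float) for v in values
        ):
            flags[name] = "ID-like (unique per row)"
    return flags


# Made-up sample data for illustration only.
target = [0, 1, 0, 1, 1, 0]
features = {
    "customer_id": ["C001", "C002", "C003", "C004", "C005", "C006"],
    "tenure": [24, 2, 36, 2, 1, 48],
    "churn_score": [0.02, 0.97, 0.05, 0.99, 0.95, 0.01],
}
print(flag_suspicious_columns(features, target))
```

Here `churn_score` is flagged for correlation and `customer_id` as ID-like, while `tenure` passes both checks.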

🔍 Reliability Engine

  • Detects unrealistic scores (>0.98)

  • Adjusts confidence level automatically:

    • 0.99 → Low confidence ❌

    • 0.98 → Medium confidence ⚠️

  • Highlights possible issues:

    • Data leakage
    • Overfitting
    • Sampling bias
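
The confidence adjustment can be read as a simple threshold mapping. The sketch below uses the 0.99 and 0.98 cutoffs listed above; treating them as inclusive lower bounds, returning "high" for anything below 0.98, and the name `confidence_level` are all assumptions.

```python
def confidence_level(score: float) -> str:
    """Map a validation score in [0, 1] to a confidence label.

    Very high scores lower confidence, because they often point to leakage,
    overfitting, or sampling bias rather than a genuinely better model.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.99:  # assumed inclusive bound
        return "low"
    if score >= 0.98:  # assumed inclusive bound
        return "medium"
    return "high"  # assumed default label, not stated in this README


for score in (0.857, 0.985, 0.995):
    print(score, confidence_level(score))
```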

⚖️ Class Imbalance Handling

  • Detects imbalance automatically
  • Applies SMOTE (if available)
  • Falls back to class weights
  • Switches evaluation metric to F1 when needed
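
The class-weight fallback can be sketched as follows. The weight formula is the common "balanced" heuristic, `n_samples / (n_classes * class_count)`; the 0.2 minority-ratio cutoff and the function names are hypothetical and not taken from featuremind.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def minority_ratio(labels: Sequence[Hashable]) -> float:
    """Share of rows that belong to the rarest class."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return min(counts.values()) / len(labels)


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class inversely to its frequency."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up label distribution for illustration only.
labels = ["No"] * 90 + ["Yes"] * 10
IMBALANCE_CUTOFF = 0.2  # hypothetical threshold

if minority_ratio(labels) < IMBALANCE_CUTOFF:
    print("imbalanced, weights:", balanced_class_weights(labels))
```

With a 90/10 split, the rare class gets a weight of 5.0 and the common class roughly 0.56, so errors on the rare class cost more during training.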

📊 SHAP Explainability

  • Computes SHAP values for model explainability
  • Displays top features influencing predictions
  • Helps identify real business drivers

🔬 Feature Engineering (Multi-layer)

  • Domain-aware features: Telecom · Medical · Real Estate · Finance · HR
  • Interactions, ratios, log transforms, polynomial features
  • Only surfaces features that improve performance

🏗️ Production Pipeline

  • Save trained model + preprocessing pipeline
  • Load and predict on new/unseen data
  • Handles missing columns and unseen categories
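
One way to handle missing columns and unseen categories at prediction time is to align each incoming record to the schema seen during training. The schema layout, the `__unknown__` token, and the name `align_row` are assumptions for this sketch, not featuremind's internal format.

```python
from typing import Any, Dict

UNKNOWN = "__unknown__"  # hypothetical placeholder for unseen categories


def align_row(row: Dict[str, Any], schema: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Align one input record to the training schema.

    Missing or null columns get the schema default, categories not seen in
    training map to UNKNOWN, and columns outside the schema are dropped.
    """
    aligned: Dict[str, Any] = {}
    for column, spec in schema.items():
        value = row.get(column)
        if value is None:
            value = spec["default"]
        categories = spec.get("categories")
        if categories is not None and value not in categories:
            value = UNKNOWN
        aligned[column] = value
    return aligned


# Made-up schema and record for illustration only.
schema = {
    "tenure": {"default": 0, "categories": None},
    "contract": {
        "default": "Month-to-month",
        "categories": {"Month-to-month", "One year", "Two year"},
    },
}
print(align_row({"contract": "Three year", "extra_column": 1}, schema))
```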

🏆 Experiment Tracking

  • Logs every run automatically
  • Leaderboard of models and scores
  • Export results to CSV
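
A leaderboard with CSV export needs little more than a sort and the standard `csv` module. The record fields and function names below are hypothetical, and the scores are made-up numbers for illustration, not results produced by featuremind.

```python
import csv
import io
from typing import Dict, List

FIELDS = ["model", "metric", "score"]


def leaderboard(runs: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return runs sorted from best to worst score."""
    return sorted(runs, key=lambda run: run["score"], reverse=True)


def to_csv(runs: List[Dict[str, object]]) -> str:
    """Serialize runs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(runs)
    return buffer.getvalue()


# Made-up scores for illustration only.
runs = [
    {"model": "LogisticRegression", "metric": "f1", "score": 0.79},
    {"model": "GradientBoosting", "metric": "f1", "score": 0.84},
    {"model": "RandomForest", "metric": "f1", "score": 0.81},
]
board = leaderboard(runs)
print(to_csv(board))
```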

🌐 REST API (Optional)

  • FastAPI-based prediction server
  • Ready-to-use endpoints for deployment

🆚 Why featuremind?

| Capability | featuremind | Typical AutoML Tools |
| --- | --- | --- |
| One-line usage | ✅ | ❌ |
| Leakage detection | ✅ | ❌ |
| Reliability scoring | ✅ | ❌ |
| SHAP explainability | ✅ | ⚠️ |
| Production pipeline | ✅ | ✅ |

📦 Installation

# Install from a local checkout (editable mode)
pip install -e .

# (Recommended) Install advanced ML libraries
pip install xgboost lightgbm catboost shap imbalanced-learn

# Optional API support
pip install fastapi uvicorn python-multipart

Or install the published release from PyPI:

pip install featuremind

🚀 Quick Start

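import pandas as pd
import featuremind as fm

# Run the full automated analysis
fm.analyze("data.csv")

# Check for target leakage before training
fm.check_leakage("data.csv", target="Churn")

# Train and save the pipeline
pipeline = fm.train("data.csv", target="Churn")
pipeline.save("churn_pipeline")

# Load the saved pipeline and predict on new data
# ("new_customers.csv" is a placeholder for your own unseen data)
new_data = pd.read_csv("new_customers.csv")
pipeline = fm.load_pipeline("churn_pipeline")
results = pipeline.predict_df(new_data)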

# Experiment tracking
fm.get_tracker().leaderboard()

# Optional API
fm.serve("churn_pipeline/", port=8000)

🏢 Example Use Case

A company can use featuremind to:

  1. Upload raw dataset (CSV)
  2. Run fm.analyze()
  3. Identify key drivers using SHAP
  4. Validate model reliability
  5. Deploy pipeline via API

➡️ This can reduce a manual ML workflow from days to minutes.


📁 Project Structure

featuremind_project/
│
├── featuremind/
│   ├── analyzer.py
│   ├── feature_engineer.py
│   ├── evaluator.py
│   ├── leakage_guard.py
│   ├── importance.py
│   ├── reporter.py
│   ├── html_reporter.py
│   ├── insights.py
│   ├── pipeline.py
│   ├── tracker.py
│   └── api.py
│
├── setup.py
├── requirements.txt
├── test.py
└── README.md

⚠️ Notes

  • High accuracy (>0.98) may indicate:

    • Data leakage
    • Highly separable datasets
    • Sampling bias
  • Always validate models on unseen data.
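
Validating on unseen data starts with holding some rows back before training. The sketch below is a minimal shuffled holdout split using only the standard library; the function name, the 20% default, and the fixed seed are assumptions, not part of featuremind.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def holdout_split(
    rows: Sequence[T], test_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle row indices and hold out a fraction of rows for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    if len(rows) < 2:
        raise ValueError("need at least two rows to split")

    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)

    # Keep at least one row on each side of the split.
    n_test = min(len(rows) - 1, max(1, round(len(rows) * test_fraction)))
    test_idx = set(indices[:n_test])

    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


rows = list(range(10))
train_rows, test_rows = holdout_split(rows)
print(len(train_rows), len(test_rows))
```

The held-out rows should only be scored once, after model selection and tuning are finished, otherwise they stop being unseen.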


📊 Output Files

  • featuremind_report.html → Full analysis report
  • featuremind_report.png → Feature visualization
  • enhanced_data.csv → Dataset with engineered features
  • featuremind_experiments.csv → Experiment logs
  • pipeline/ → Saved production model

💡 Use Cases

  • Telecom churn prediction
  • Fraud detection
  • Healthcare predictions
  • Real estate pricing
  • HR analytics
  • Any tabular ML problem

🔮 Roadmap

  • Time-series support
  • Deep learning integration
  • Streamlit dashboard
  • Cloud deployment

📄 License

MIT License


👩‍💻 Author

Niveditha, Aspiring Data Scientist & ML Engineer


⭐ If this project helps you, consider giving it a star!
