ML Pipeline Automation Framework - Chain together data processing, model training, and deployment with minimal code
MLFCrafter
⭐ If you find MLFCrafter useful, please consider starring this repository!
Your support helps us continue developing and improving MLFCrafter for the ML community.
What is MLFCrafter?
MLFCrafter is a Python framework for building machine learning pipelines from chainable "crafter" components. Each crafter handles one stage (data ingestion, cleaning, scaling, training, scoring, or deployment), so a complete pipeline can be defined and run in a few lines of code.
Key Features
- 🔗 Chainable Architecture - Connect multiple processing steps seamlessly
- 📊 Smart Data Handling - Automatic data ingestion from CSV, Excel, JSON
- 🧹 Intelligent Cleaning - Multiple strategies for missing value handling
- 📏 Flexible Scaling - MinMax, Standard, and Robust scaling options
- 🤖 Multiple Models - Random Forest, XGBoost, Logistic Regression support
- 📈 Comprehensive Metrics - Accuracy, Precision, Recall, F1-Score
- 💾 Easy Deployment - Save trained models together with their metadata in a single step
- 🔄 Context-Based - Seamless data flow between pipeline steps
Quick Start
Installation
pip install mlfcrafter
Basic Usage
from mlfcrafter import (
    MLFChain,
    DataIngestCrafter,
    CleanerCrafter,
    ScalerCrafter,
    ModelCrafter,
    ScorerCrafter,
    DeployCrafter,
)

# Define the whole ML pipeline in a single expression
chain = MLFChain(
    DataIngestCrafter(data_path="data/iris.csv"),
    CleanerCrafter(strategy="auto"),
    ScalerCrafter(scaler_type="standard"),
    ModelCrafter(model_name="random_forest"),
    ScorerCrafter(),
    DeployCrafter()
)
# Run entire pipeline
results = chain.run(target_column="species")
print(f"Test Score: {results['test_score']:.4f}")
Advanced Configuration
chain = MLFChain(
    DataIngestCrafter(data_path="data/titanic.csv", source_type="csv"),
    CleanerCrafter(strategy="mean", str_fill="Unknown"),
    ScalerCrafter(scaler_type="minmax", columns=["age", "fare"]),
    ModelCrafter(
        model_name="xgboost",
        model_params={"n_estimators": 200, "max_depth": 6},
        test_size=0.25
    ),
    ScorerCrafter(),
    DeployCrafter(model_path="models/titanic_model.joblib")
)
results = chain.run(target_column="survived")
Components (Crafters)
DataIngestCrafter
Loads data from various file formats:
DataIngestCrafter(
    data_path="path/to/data.csv",
    source_type="auto"  # auto, csv, excel, json
)
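As a rough illustration of what `source_type="auto"` might involve, the sketch below infers the format from the file extension. The `infer_source_type` helper and the extension mapping are assumptions made for this example; they are not taken from MLFCrafter's source, and the library may detect formats differently.

```python
from pathlib import Path

# Hypothetical mapping; the extensions MLFCrafter actually recognizes may differ.
EXTENSION_TO_SOURCE = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".json": "json",
}


def infer_source_type(data_path: str, source_type: str = "auto") -> str:
    """Return an explicit source type, inferring it from the suffix if 'auto'."""
    if source_type != "auto":
        return source_type  # an explicit choice always wins
    suffix = Path(data_path).suffix.lower()
    try:
        return EXTENSION_TO_SOURCE[suffix]
    except KeyError:
        raise ValueError(f"Cannot infer source type from {data_path!r}") from None


print(infer_source_type("data/iris.csv"))
print(infer_source_type("reports/sales.XLSX"))
print(infer_source_type("dump.txt", source_type="json"))
```

Passing an explicit `source_type` is the safer option when a file's extension does not match its contents.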
CleanerCrafter
Handles missing values intelligently:
CleanerCrafter(
    strategy="auto",     # auto, mean, median, mode, drop, constant
    str_fill="missing",  # Fill value for strings
    int_fill=0.0         # Fill value for numbers
)
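To make the strategy names concrete, here is a minimal standard-library sketch of how each one could treat a single numeric column. The `fill_missing` function is hypothetical and only mirrors the option names listed above; the `auto` strategy is left out because this README does not describe how it chooses a method.

```python
from statistics import mean, median, mode
from typing import List, Optional


def fill_missing(
    values: List[Optional[float]],
    strategy: str = "mean",
    fill_value: float = 0.0,
) -> List[float]:
    """Handle None entries in one numeric column using the named strategy."""
    present = [v for v in values if v is not None]
    if strategy == "drop":
        return present  # discard rows with a missing value
    if strategy == "constant":
        replacement = fill_value
    elif strategy in ("mean", "median", "mode"):
        if not present:
            raise ValueError("Column has no observed values to summarize")
        summarize = {"mean": mean, "median": median, "mode": mode}[strategy]
        replacement = summarize(present)
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    return [replacement if v is None else v for v in values]


ages = [22.0, None, 30.0, 38.0, None, 22.0]
for name in ("mean", "median", "mode", "constant", "drop"):
    print(name, fill_missing(ages, strategy=name))
```

Mean imputation is sensitive to outliers, so `median` is often the more conservative choice for skewed columns.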
ScalerCrafter
Scales numerical features:
ScalerCrafter(
    scaler_type="standard",    # standard, minmax, robust
    columns=["age", "income"]  # Specific columns or None for all numeric
)
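The three scaler types correspond to well-known formulas, sketched below in plain Python for a single column. This is only an illustration of the arithmetic, assuming the usual definitions (population standard deviation, linearly interpolated quartiles); it is not MLFCrafter's implementation, and the function names are made up for this example. A constant column would cause a division by zero here, which a production scaler has to guard against.

```python
from statistics import fmean, median, pstdev, quantiles
from typing import List


def standard_scale(xs: List[float]) -> List[float]:
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = fmean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def minmax_scale(xs: List[float]) -> List[float]:
    """Map the smallest value to 0 and the largest to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def robust_scale(xs: List[float]) -> List[float]:
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


fares = [1.0, 2.0, 3.0, 4.0, 100.0]  # one extreme outlier
print("standard:", standard_scale(fares))
print("minmax:  ", minmax_scale(fares))
print("robust:  ", robust_scale(fares))
```

With the outlier in `fares`, min-max scaling squeezes the four small values close to 0, while robust scaling keeps them evenly spaced, which is why `robust` tends to suit columns with extreme values.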
ModelCrafter
Trains ML models:
ModelCrafter(
    model_name="random_forest",  # random_forest, xgboost, logistic_regression
    model_params={"n_estimators": 100},
    test_size=0.2,
    stratify=True
)
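The `test_size` and `stratify` options describe how the data is divided before training. The sketch below shows the idea behind a stratified split using only the standard library: each class contributes the same fraction of its rows to the test set. The `stratified_split` helper is hypothetical; in practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument does this job, and how MLFCrafter performs the split internally is not stated here.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[str], test_size: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) that preserve class proportions."""
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # seeded so the split is reproducible
    train: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = ["yes"] * 8 + ["no"] * 12
train_idx, test_idx = stratified_split(labels, test_size=0.25)
print("train rows:", len(train_idx), "test rows:", len(test_idx))
print("'yes' rows in test:", sum(labels[i] == "yes" for i in test_idx))
```

Stratifying matters most for imbalanced targets, where a purely random split can leave the test set with very few examples of the minority class.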
ScorerCrafter
Calculates performance metrics:
ScorerCrafter(
    metrics=["accuracy", "precision", "recall", "f1"]  # Default: all metrics
)
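For reference, the four metrics can be computed by hand for a binary problem as sketched below. The `binary_metrics` function is illustrative only; it assumes labels are 0 and 1 with 1 as the positive class, and it says nothing about how ScorerCrafter averages scores for multi-class targets.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Accuracy alone can be misleading on imbalanced data, which is one reason to request precision, recall, and F1 alongside it.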
DeployCrafter
Saves trained models:
DeployCrafter(
    model_path="model.joblib",
    save_format="joblib",  # joblib or pickle
    include_scaler=True,
    include_metadata=True
)
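One plausible way to save a model together with its scaler and metadata is to bundle them into a single dictionary and serialize that, as in the sketch below using the standard-library `pickle` module. The `save_artifacts` and `load_artifacts` helpers, the `scaler` key, and the `saved_at` field are assumptions made for this example; the file layout DeployCrafter actually writes may differ.

```python
import pickle
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, Optional


def save_artifacts(
    model: Any,
    path: Path,
    scaler: Optional[Any] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Path:
    """Bundle model, scaler, and metadata into one pickle file."""
    artifacts = {
        "model": model,
        "scaler": scaler,
        "metadata": {
            "saved_at": datetime.now(timezone.utc).isoformat(),
            **(metadata or {}),
        },
    }
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create models/ if needed
    with path.open("wb") as handle:
        pickle.dump(artifacts, handle)
    return path


def load_artifacts(path: Path) -> Dict[str, Any]:
    """Read the bundle back. Only unpickle files from a trusted source."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    # A plain dict stands in for a trained model in this demonstration.
    saved = save_artifacts(
        {"weights": [0.1, 0.9]},
        Path(tmp) / "models" / "demo.pkl",
        metadata={"model_name": "demo"},
    )
    restored = load_artifacts(saved)

print(restored["metadata"]["model_name"])
```

Keeping the scaler in the same file as the model helps ensure that new data is transformed exactly as the training data was.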
Alternative Usage Patterns
Step-by-Step Building
chain = MLFChain()
chain.add_crafter(DataIngestCrafter(data_path="data.csv"))
chain.add_crafter(CleanerCrafter(strategy="median"))
chain.add_crafter(ModelCrafter(model_name="xgboost"))
results = chain.run(target_column="target")
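The chainable, context-based design can be pictured as a list of steps that each receive a shared dictionary and return an updated one. The sketch below is a simplified stand-in written for illustration: `SketchChain`, `add_step`, and the two step functions are hypothetical names and do not reflect MLFCrafter's internal classes.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Step = Callable[[Context], Context]


class SketchChain:
    """Illustrative stand-in for a pipeline chain; not MLFCrafter's code."""

    def __init__(self, *steps: Step) -> None:
        self.steps: List[Step] = list(steps)

    def add_step(self, step: Step) -> "SketchChain":
        self.steps.append(step)
        return self  # returning self allows calls to be chained

    def run(self, **initial: Any) -> Context:
        context: Context = dict(initial)
        for step in self.steps:
            context = step(context)  # each step reads and extends the context
        return context


def ingest(context: Context) -> Context:
    # Hypothetical step: pretend these rows were loaded from a file.
    return {**context, "rows": [1.0, None, 3.0]}


def clean(context: Context) -> Context:
    # Hypothetical step: drop missing values left by the previous step.
    return {**context, "rows": [r for r in context["rows"] if r is not None]}


chain = SketchChain(ingest).add_step(clean)
result = chain.run(target_column="target")
print(result)
```

Because every step shares the same context, a later step can use anything an earlier step produced without the caller wiring them together by hand.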
Loading Saved Models
artifacts = DeployCrafter.load_model("model.joblib")
model = artifacts["model"]
metadata = artifacts["metadata"]
Requirements
- Python: 3.8 or higher
- Core Dependencies: pandas, scikit-learn, numpy, xgboost, joblib
Development
Setup Development Environment
git clone https://github.com/brkcvlk/mlfcrafter.git
cd mlfcrafter
pip install -r requirements-dev.txt
pip install -e .
Run Tests
# Run all tests
python -m pytest tests/ -v
# Run tests with coverage
python -m pytest tests/ -v --cov=mlfcrafter --cov-report=html
# Check code quality
ruff check .
# Auto-fix code issues
ruff check --fix .
# Format code
ruff format .
Run Examples
python example.py
Documentation
Complete documentation is available at MLFCrafter Docs
Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- 📖 Documentation: MLFCrafter Docs
- 🐛 Bug Reports: GitHub Issues
- 💬 Discussions: GitHub Discussions
Made for the ML Community
File details
Details for the file mlfcrafter-0.1.0.tar.gz.
File metadata
- Download URL: mlfcrafter-0.1.0.tar.gz
- Upload date:
- Size: 22.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 16a22537ec1a9ec6817f7c4e23bd0c172fca62961707585629731585a19d9395 |
| MD5 | ca872150611df2e35d15fc5acefcf8c9 |
| BLAKE2b-256 | 14f6ed2d765fba0cff7784650861e74b081f98b84100291d482fa71376f6c21e |
File details
Details for the file mlfcrafter-0.1.0-py3-none-any.whl.
File metadata
- Download URL: mlfcrafter-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 190562de734124cee96ccb45a9ecaccbca9dc9d659366f289e6ae7efa53da7db |
| MD5 | fcbf043ed6f0ee34d16171ce94a215d0 |
| BLAKE2b-256 | 4b302f3ee8a243fadfa190151a77cef0f083e7478d9ac24dfb8bbf52e31558c6 |