AutoML library for the full ML lifecycle — train, evaluate, visualise, and save
PyFlowML 🚀
A scalable, intelligent Python ML library for the full machine learning lifecycle.
PyFlowML automates every step from raw, potentially large data to a trained, evaluated, and saved model.
Features
| Feature | Description |
|---|---|
| 🧠 AutoML | Trains multiple models in parallel, picks the best automatically |
| 📊 Data Profiler | Detects problem type, missing data, skewness, and outliers |
| ⚙️ Smart Pipeline | Cached preprocessing: no redundant recomputation |
| 📦 Memory Optimizer | Downcasts column dtypes to reduce memory use; the stated 40–60% RAM saving will vary with the data |
| 🚀 Parallel Training | All models train simultaneously via joblib |
| ⏱️ Time Budget | `AutoClassifier(time_limit=60)` stops training once the time budget is exceeded |
| 📈 Visualization | Dark-themed ROC, confusion matrix, learning curves |
| 💾 Versioned Saves | Models saved with metadata (date, metrics, features) |
| 📝 NLP Utilities | TF-IDF, Bag-of-Words, stopword removal, lemmatization |
| 📡 Monitoring | Per-step time and memory tracking |
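To make the Memory Optimizer row concrete, here is a minimal sketch of dtype downcasting in plain pandas. It is not PyFlowML's `MemoryOptimizer` implementation: `downcast_numeric` is a hypothetical helper, and the toy columns are made up for illustration.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric columns stored in the smallest safe dtype.

    Hypothetical helper for illustration; not part of PyFlowML.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # Picks the smallest signed integer dtype that holds every value.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # pandas goes no lower than float32 when downcasting floats.
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


# Toy data: both columns start as 64-bit types.
df = pd.DataFrame(
    {
        "age": np.arange(1000, dtype="int64") % 90,  # values 0..89 fit in int8
        "fare": np.arange(1000) * 0.5,               # exactly representable in float32
    }
)

small = downcast_numeric(df)
before = int(df.memory_usage(deep=True).sum())
after = int(small.memory_usage(deep=True).sum())
print(small.dtypes.to_dict())
print(f"{before} bytes -> {after} bytes")
```

How much memory this saves depends on the value ranges and on how many columns are numeric. Object (string) columns are left untouched in this sketch.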
Installation
pip install -e .
Or install the dependencies listed in requirements.txt:
pip install -r requirements.txt
Quickstart
One-line AutoML Engine
import pandas as pd
from pyflowml.core.engine import PyFlowEngine
df = pd.read_csv("titanic.csv")
engine = PyFlowEngine(df, target="Survived", time_limit=120)
engine.run()
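The `time_limit` argument above implies a training budget. One common way to enforce such a budget is sketched below with the standard library only. This is an assumption about the general technique, not PyFlowML's actual logic: `select_within_budget` and `fake_model` are hypothetical names, the budget is assumed to be in seconds, and the candidates run one after another rather than in parallel.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def select_within_budget(
    candidates: Dict[str, Callable[[], float]],
    time_limit: float,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Evaluate candidates in order until the budget (in seconds) is spent.

    A candidate that has already started is allowed to finish; only
    candidates that have not started yet are skipped.
    """
    deadline = time.monotonic() + time_limit
    scores: Dict[str, float] = {}
    for name, train_and_score in candidates.items():
        if time.monotonic() >= deadline:
            break  # budget exhausted: do not start further candidates
        scores[name] = train_and_score()
    best = max(scores, key=scores.get) if scores else None
    return best, scores


def fake_model(score: float, seconds: float) -> Callable[[], float]:
    """Stand-in for 'train a model and return its validation score'."""
    def run() -> float:
        time.sleep(seconds)
        return score
    return run


candidates = {
    "fast": fake_model(0.80, 0.01),
    "slow": fake_model(0.90, 0.20),
    "never_started": fake_model(0.99, 0.01),
}
best, scores = select_within_budget(candidates, time_limit=0.1)
print(best, scores)
```

In this sketch the budget is checked only between candidates, so the total run time can overshoot the limit by the duration of the last candidate that was started.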
Manual Step-by-Step
from pyflowml.data import DataLoader, DataCleaner, DataSplitter, MemoryOptimizer
from pyflowml.preprocessing import SmartPipeline
from pyflowml.models import AutoClassifier
from pyflowml.evaluation import Reporter
from pyflowml.visualization import ModelViz
from pyflowml.utils import ModelSaver
# Load & optimize
df = DataLoader.from_csv("data.csv")
df = MemoryOptimizer.reduce(df)
# Clean
df = DataCleaner(df).handle_nulls().remove_outliers().remove_duplicates().result()
# Split
X_train, X_test, y_train, y_test = DataSplitter(df, target="label").split()
# Preprocess
pipe = SmartPipeline()
X_train = pipe.fit_transform(X_train, y_train)
X_test = pipe.transform(X_test)
# Train best model
clf = AutoClassifier(metric="f1", time_limit=60)
clf.fit(X_train, y_train)
clf.leaderboard()
# Evaluate
Reporter.classification(clf, X_test, y_test)
# Visualize
ModelViz.confusion_matrix(clf, X_test, y_test)
ModelViz.roc_curve(clf, X_test, y_test)
# Save
ModelSaver.save(clf, "my_model", metadata={"f1": clf.best_score_})
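The final step above stores a model together with metadata. As a rough illustration of what a versioned save can look like, the sketch below writes each save into a new `vN` directory next to a JSON metadata file. The directory layout, the file names, and the `save_versioned` helper are all assumptions made for this example; `ModelSaver` may organise its output differently.

```python
import json
import pickle
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def save_versioned(model: Any, name: str, root: Path, metadata: Dict[str, Any]) -> Path:
    """Pickle the model into root/name/vN/ alongside a metadata.json file.

    Hypothetical helper for illustration; not part of PyFlowML.
    """
    model_dir = root / name
    model_dir.mkdir(parents=True, exist_ok=True)
    # Find existing version numbers (v1, v2, ...) and pick the next one.
    existing = [int(p.name[1:]) for p in model_dir.glob("v*") if p.name[1:].isdigit()]
    version_dir = model_dir / f"v{max(existing, default=0) + 1}"
    version_dir.mkdir()
    with open(version_dir / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    record = {"saved_at": datetime.now(timezone.utc).isoformat(), **metadata}
    (version_dir / "metadata.json").write_text(json.dumps(record, indent=2))
    return version_dir


root = Path(tempfile.mkdtemp())
toy_model = {"weights": [0.1, 0.2]}  # stand-in for a fitted estimator
# The f1 values below are placeholders, not measured results.
first = save_versioned(toy_model, "my_model", root, {"f1": 0.87})
second = save_versioned(toy_model, "my_model", root, {"f1": 0.91})
print(first.name, second.name)
```

Keeping the metadata in a separate JSON file means it can be inspected without unpickling the model, which matters because unpickling files from untrusted sources is unsafe.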
Modules
pyflowml/
├── core/ # Engine, Profiler, Optimizer (brain of the system)
├── data/ # DataLoader, DataCleaner, DataSplitter, MemoryOptimizer
├── preprocessing/ # Scaler, FeatureSelector, SmartPipeline
├── models/ # AutoClassifier, AutoRegressor, AutoClusterer
├── evaluation/ # Reporter, CrossValidator
├── tuning/ # HyperTuner
├── visualization/ # Plotter, ModelViz
├── monitoring/ # StepTracker, Logger
├── utils/ # ModelSaver
└── nlp/ # TextCleaner, Vectorizer
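The `monitoring/` module is described as tracking time and memory per step. A minimal standard-library sketch of that idea follows. `track_step` and `step_stats` are hypothetical names and do not reflect the `StepTracker` API.

```python
import time
import tracemalloc
from contextlib import contextmanager
from typing import Dict, Iterator

# Collected measurements, keyed by step name.
step_stats: Dict[str, Dict[str, float]] = {}


@contextmanager
def track_step(name: str) -> Iterator[None]:
    """Record wall-clock seconds and peak traced memory for one step.

    Not nestable: leaving an inner block stops tracing for the outer one too.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        step_stats[name] = {"seconds": elapsed, "peak_bytes": float(peak)}


with track_step("build_list"):
    data = [i * i for i in range(100_000)]

with track_step("sum"):
    total = sum(data)

for step, stats in step_stats.items():
    print(f"{step}: {stats['seconds']:.4f}s, peak {stats['peak_bytes'] / 1e6:.2f} MB")
```

`tracemalloc` only sees allocations made through Python's memory allocators, and it slows the traced code down, so the numbers are best read as relative indicators between steps.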
Running Tests
pytest tests/ -v
License
MIT
Download files
File details
Details for the file pyflowml-1.0.1.tar.gz.
File metadata
- Download URL: pyflowml-1.0.1.tar.gz
- Upload date:
- Size: 36.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 659a1ba248f3cab9c4b9a5c36fbaff489d52243cd2eb3704a65e235eaa95354a |
| MD5 | b532372182842f2e1c6605ca4491bdec |
| BLAKE2b-256 | 8f315e3bbe4c97ec1c6aca499ac38f3bc225bb566c9af63997f0bc8483b13ede |
File details
Details for the file pyflowml-1.0.1-py3-none-any.whl.
File metadata
- Download URL: pyflowml-1.0.1-py3-none-any.whl
- Upload date:
- Size: 44.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e00d67b606f0557b7c560f3abff58a7492cac3046015e80bb9bd6d1fc664c833 |
| MD5 | aa38ddfc63dd32c69b3b0a791855ae87 |
| BLAKE2b-256 | c895920b22a6c03894d25b49bebe097fbf3cc4170d2a3cc93fdba9a04b27a6b4 |