
AutoML library for the full ML lifecycle — train, evaluate, visualise, and save


PyFlowML 🚀

A scalable, intelligent Python ML library for the full machine learning lifecycle.

PyFlowML handles everything from raw, potentially large data to a trained, evaluated, and saved model — automatically.


Features

| Feature | Description |
|---|---|
| 🧠 AutoML | Trains multiple models in parallel, picks the best automatically |
| 📊 Data Profiler | Detects problem type, missing data, skewness, and outliers |
| ⚙️ Smart Pipeline | Cached preprocessing: no redundant recomputation |
| 📦 Memory Optimizer | Downcasts dtypes, saves 40–60% RAM |
| 🚀 Parallel Training | All models train simultaneously via joblib |
| ⏱️ Time Budget | AutoClassifier(time_limit=60) stops when the budget is exceeded |
| 📈 Visualization | Dark-themed ROC, confusion matrix, learning curves |
| 💾 Versioned Saves | Models saved with metadata (date, metrics, features) |
| 📝 NLP Utilities | TF-IDF, Bag-of-Words, stopword removal, lemmatization |
| 📡 Monitoring | Per-step time and memory tracking |
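
To make the Memory Optimizer row concrete, here is a minimal sketch of dtype downcasting written with plain pandas. It is not PyFlowML's implementation: the helper name `downcast_numeric` and the sample columns are assumptions for illustration, and the savings you see depend entirely on your data.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric columns stored in smaller dtypes.

    Hypothetical helper for illustration; not part of PyFlowML.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # Picks the smallest integer dtype that can hold every value.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # Picks float32 when the values survive the conversion.
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


# Made-up sample data: both columns start out as 64-bit.
df = pd.DataFrame({
    "age": np.arange(1000, dtype="int64") % 90,
    "fare": np.linspace(0.0, 500.0, 1000),
})
small = downcast_numeric(df)

before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
print(small.dtypes)
```

Downcasting floats trades precision for memory, so it is worth checking that downstream metrics do not change before applying it to a whole dataset.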

Installation

pip install -e .

Or install only the dependencies listed in requirements.txt:

pip install -r requirements.txt

Quickstart

One-line AutoML Engine

import pandas as pd
from pyflowml.core.engine import PyFlowEngine

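# Load the raw data, then let the engine handle the rest of the lifecycle,
# using "Survived" as the target column.
df = pd.read_csv("titanic.csv")
engine = PyFlowEngine(df, target="Survived", time_limit=120)
engine.run()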
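
The `time_limit` argument above sets a training budget. How PyFlowML enforces it is not documented here, so the following is only a sketch of one common approach: check a deadline before starting each candidate model. The function `run_with_budget`, the stand-in candidates, and the assumption that the limit is in seconds are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def run_with_budget(
    candidates: Dict[str, Callable[[], float]],
    time_limit: float,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Evaluate candidates in order until the budget (assumed seconds) runs out.

    Hypothetical helper for illustration; not part of PyFlowML.
    """
    deadline = time.monotonic() + time_limit
    scores: Dict[str, float] = {}
    for name, train_and_score in candidates.items():
        if time.monotonic() >= deadline:
            break  # budget exhausted: skip the remaining candidates
        scores[name] = train_and_score()
    # The best candidate is the highest score among those that actually ran.
    best = max(scores, key=scores.get) if scores else None
    return best, scores


# Stand-in "models": each callable pretends to train and returns a score.
candidates = {
    "fast_model": lambda: 0.71,
    "better_model": lambda: 0.83,
    "slow_model": lambda: (time.sleep(0.05), 0.80)[1],
}

best, scores = run_with_budget(candidates, time_limit=5.0)
print(best, scores)
```

In this sketch the deadline is only checked between candidates, so a model that is already training is never interrupted and the total run time can overshoot the limit by the duration of one fit.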

Manual Step-by-Step

from pyflowml.data import DataLoader, DataCleaner, DataSplitter, MemoryOptimizer
from pyflowml.preprocessing import SmartPipeline
from pyflowml.models import AutoClassifier
from pyflowml.evaluation import Reporter
from pyflowml.visualization import ModelViz
from pyflowml.utils import ModelSaver

# Load & optimize
df = DataLoader.from_csv("data.csv")
df = MemoryOptimizer.reduce(df)

# Clean
df = DataCleaner(df).handle_nulls().remove_outliers().remove_duplicates().result()

# Split
X_train, X_test, y_train, y_test = DataSplitter(df, target="label").split()

# Preprocess
pipe = SmartPipeline()
X_train = pipe.fit_transform(X_train, y_train)
X_test  = pipe.transform(X_test)

# Train best model
clf = AutoClassifier(metric="f1", time_limit=60)
clf.fit(X_train, y_train)
clf.leaderboard()

# Evaluate
Reporter.classification(clf, X_test, y_test)

# Visualize
ModelViz.confusion_matrix(clf, X_test, y_test)
ModelViz.roc_curve(clf, X_test, y_test)

# Save
ModelSaver.save(clf, "my_model", metadata={"f1": clf.best_score_})
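
The save step stores a model together with metadata. The on-disk format used by `ModelSaver` is not described in this README, so the sketch below only illustrates the general idea with the standard library: pickle the model under an incrementing version number and write the metadata to a JSON file beside it. The function `save_versioned`, the file naming scheme, and the sample model are assumptions.

```python
import json
import pickle
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, Optional


def save_versioned(model: Any, name: str, directory: str,
                   metadata: Optional[Dict[str, Any]] = None) -> Path:
    """Pickle model as <name>_v<N>.pkl and write a JSON metadata file next to it.

    Hypothetical helper for illustration; not part of PyFlowML.
    """
    root = Path(directory)
    root.mkdir(parents=True, exist_ok=True)
    # Next version = number of existing pickles for this name, plus one.
    version = len(list(root.glob(f"{name}_v*.pkl"))) + 1
    model_path = root / f"{name}_v{version}.pkl"
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)
    record = {
        "name": name,
        "version": version,
        "saved_at": datetime.now(timezone.utc).isoformat(),
        **(metadata or {}),
    }
    model_path.with_suffix(".json").write_text(json.dumps(record, indent=2))
    return model_path


# Save the same stand-in "model" twice to show the version counter advancing.
with tempfile.TemporaryDirectory() as tmp:
    first = save_versioned({"weights": [0.1, 0.2]}, "my_model", tmp, {"f1": 0.87})
    second = save_versioned({"weights": [0.3, 0.4]}, "my_model", tmp, {"f1": 0.89})
    meta = json.loads(first.with_suffix(".json").read_text())
    print(first.name, second.name)
    print(meta)
```

Keeping the metadata in a separate JSON file means the scores and save date can be inspected without unpickling the model. Only load pickle files from sources you trust.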

Modules

pyflowml/
├── core/          # Engine, Profiler, Optimizer (brain of the system)
├── data/          # DataLoader, DataCleaner, DataSplitter, MemoryOptimizer
├── preprocessing/ # Scaler, FeatureSelector, SmartPipeline
├── models/        # AutoClassifier, AutoRegressor, AutoClusterer
├── evaluation/    # Reporter, CrossValidator
├── tuning/        # HyperTuner
├── visualization/ # Plotter, ModelViz
├── monitoring/    # StepTracker, Logger
├── utils/         # ModelSaver
└── nlp/           # TextCleaner, Vectorizer

Running Tests

pytest tests/ -v

License

MIT
