optimalflow

OptimalFlow is an Omni-ensemble Automated Machine Learning toolkit that helps data scientists build optimal models easily and automate the Machine Learning workflow with simple code.

OptimalFlow

Author: Tony Dong

OptimalFlow is an Omni-ensemble Automated Machine Learning toolkit based on the Pipeline Cluster Traversal Experiment approach. It helps data scientists build optimal models easily and automate the Supervised Learning workflow with simple code.

Version 0.1.10 added a "no-code" Web App, built on the Flask framework, as OptimalFlow's GUI. Users can build an Automated Machine Learning workflow entirely by clicking, without writing any code. (Read more details: https://optimal-flow.readthedocs.io/en/latest/webapp.html)

Compared with other popular AutoML (Automated Machine Learning) APIs, OptimalFlow is designed as an omni-ensemble ML workflow optimizer with a higher-level API that aims to eliminate the repetitive manual train-and-evaluate experiments of general pipeline building.

To achieve that, OptimalFlow applies the Pipeline Cluster Traversal Experiments algorithm: it assembles all cross-matching pipelines covering the major tasks of a Machine Learning workflow, then runs traversal experiments over them to search for the optimal baseline model.
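The idea of traversing every cross-matched pipeline can be illustrated with a minimal, standard-library-only sketch. It is not OptimalFlow's implementation or API: every function name, the toy data, and the leave-one-out scoring are assumptions made for illustration. Each "pipeline" is one combination of a scaler, a feature selector, and a model, and the traversal scores all combinations and keeps the best.

```python
from itertools import product

# Hypothetical stand-in components; none of these names come from OptimalFlow.
def no_scaling(rows):
    return rows

def min_max_scale(rows):
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def all_features(rows):
    return rows

def last_feature_only(rows):
    return [row[-1:] for row in rows]

def nearest_neighbor(train_X, train_y, x):
    # Predict the label of the closest training row (squared Euclidean distance).
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

def majority_class(train_X, train_y, x):
    # Baseline: always predict the most frequent training label.
    return max(sorted(set(train_y)), key=train_y.count)

def loo_accuracy(X, y, model):
    """Leave-one-out accuracy of `model` on rows X with labels y."""
    hits = 0
    for i in range(len(X)):
        pred = model(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i])
        hits += pred == y[i]
    return hits / len(X)

def traverse(X, y, scalers, selectors, models):
    """Score every scaler x selector x model combination; return (best, all)."""
    results = []
    for (s, scale), (f, select), (m, model) in product(
        scalers.items(), selectors.items(), models.items()
    ):
        # Simplification: preprocessing is fitted on all rows before scoring.
        results.append(((s, f, m), loo_accuracy(select(scale(X)), y, model)))
    return max(results, key=lambda r: r[1]), results

# Toy data: the first feature is large-scale noise, the second is informative.
X = [[100.0, 0.10], [300.0, 0.20], [200.0, 0.15],
     [150.0, 0.90], [250.0, 0.80], [120.0, 0.85]]
y = [0, 0, 0, 1, 1, 1]

best, results = traverse(
    X, y,
    scalers={"none": no_scaling, "min_max": min_max_scale},
    selectors={"all": all_features, "last": last_feature_only},
    models={"nearest_neighbor": nearest_neighbor, "majority": majority_class},
)
print("pipelines tried:", len(results))
print("best pipeline:", best)
```

With two options per stage the traversal covers 2 x 2 x 2 = 8 pipelines; the number grows multiplicatively as options are added, which is why the search is automated rather than run by hand.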

In addition, because all key pipeline components are modularized into reusable packages, every component can be custom-tuned and the toolkit scales easily.

The core concept in OptimalFlow is Pipeline Cluster Traversal Experiments, a theory first proposed by Tony Dong at the Genpact 2020 GVector Conference for optimizing and automating the Machine Learning workflow using an ensemble pipelines algorithm.

Compared with the repetitive single-pipeline experiments of other automated or classic machine learning workflows, Pipeline Cluster Traversal Experiments covers a larger search scope to find the best model without manual intervention. It is also more flexible in coping with unseen data, because each component is designed as an ensemble.

In summary, OptimalFlow offers a few useful properties for data scientists:

Easy & less coding - High-level APIs implement Pipeline Cluster Traversal Experiments, and each ML component is highly automated and modularized;
Well ensembled - Each key component is an ensemble of popular algorithms, with optimal hyperparameter tuning included;
Omni-Coverage - Pipeline Cluster Traversal Experiments cross-experiments over combined permutations of input datasets, feature selection, and model selection;
Scalable - New algorithms can be added to each module easily, thanks to its ensemble and reusable coding design;
Adaptable - Pipeline Cluster Traversal Experiments makes it easier to adapt to unseen datasets with the right pipeline;
Custom Modify Welcomed - Custom settings let you add or remove algorithms or modify hyperparameters to meet flexible requirements.

Documentation: https://Optimal-Flow.readthedocs.io/

Installation

pip install OptimalFlow
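After installing, one way to confirm which version ended up in the environment is to query the package metadata with the standard library (Python 3.8+). The helper name `installed_version` is hypothetical and not part of OptimalFlow; the sketch only reads metadata and does not import the package.

```python
from importlib import metadata

def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("optimalflow")
print(version if version else "optimalflow is not installed in this environment")
```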

Core Modules:

autoPP for feature preprocessing
autoFS for classification/regression feature selection
autoCV for classification/regression model selection and evaluation
autoPipe for Pipeline Cluster Traversal Experiments
autoViz for pipeline cluster visualization. Currently available: model retrieval diagram
autoFlow for logging & tracking.

Notebook Demo:

An End-to-End OptimalFlow Automated Machine Learning Tutorial with Real Projects

Support OptimalFlow

If you like OptimalFlow, please consider starring or forking it on GitHub and spreading the word!

Please, Avoid Selling this Work as Yours

Voice from the Author: I am glad if you find OptimalFlow useful and helpful. Feel free to add it to your project and let more people know how good it is. But please avoid simply changing the name and selling it as your work. That's not why I'm sharing the source code, at all. All copyrights reserved by Tony Dong following MIT license.

License:

MIT

Updates History:

Updates on 9/29/2020

Added a SearchinSpace settings page to the Web App. Users can customize estimators'/regressors' parameters for optimal tuning outputs.
Modified some layouts of existing pages in the Web App.

Updates on 9/16/2020

Created a Web App based on the Flask framework as OptimalFlow's GUI, to build PCTE Automated Machine Learning workflows with simple clicks, without any coding at all!
The Web App includes PCTE workflow builder, LogsViewer, Visualization, and Documentation sections.
Fixed the filename issues in the autoViz module, and removed the auto_open function when generating new HTML-format plots.

Updates on 8/31/2020

Modify autoPP's default_parameters: Remove "None" in "scaler", modify "sparsity" : [0.50], modify "cols" : [100]
Modify autoViz clf_table_report()'s coloring settings
Fix bugs in autoViz reg_table_report()'s gradient coloring function

Updates on 8/28/2020

Fix round() bugs in the evaluate_model() function when handling classification problems
Remove SVM-based algorithms from fastClassifier & fastRegressor's default estimators settings
Remove SVM-based algorithms from the autoFS class's default selectors settings

Updates on 8/26/2020

Fix bugs in the evaluate_model() function when handling regression problems
Add reg_table_report() function to create a dynamic table report for regression problems in autoViz

Updates on 8/24/2020

Fix evaluate_model() function's precision_score issue when running multi-class classification problems
Add custom_selectors args for customized algorithm settings with autoFS's 2 classes (dynaFS_reg, dynaFS_clf)
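The changelog does not say what the precision_score issue was or how it was fixed. One common cause of such issues is that scikit-learn's `precision_score` defaults to `average='binary'`, which raises an error on multi-class targets unless an averaging mode such as `'macro'` is chosen. As an illustration only, and not OptimalFlow's code, macro-averaged precision can be computed by hand like this (the function name is hypothetical):

```python
def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        # True labels of every sample that was predicted as `label`.
        predicted = [t for t, p in zip(y_true, y_pred) if p == label]
        # Precision = correct predictions / all predictions of this label;
        # a label that is never predicted contributes 0.0.
        per_class.append(predicted.count(label) / len(predicted) if predicted else 0.0)
    return sum(per_class) / len(per_class)

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
print(round(macro_precision(y_true, y_pred), 4))
```

Here class 0 has precision 2/3 and classes 1 and 2 have precision 0, so the macro average is 2/9.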

Updates on 8/20/2020

Add Dynamic Table for Pipeline Cluster Model Evaluation Report in autoViz module
Add custom_estimators args for customized algorithm settings with autoCV's 4 classes (dynaClassifier, dynaRegressor, fastClassifier, and fastRegressor)

Updates on 8/14/2020

Add fastClassifier and fastRegressor classes, which are both based on random parameter search
Modify the display settings when using dynaClassifier in non in_pipeline mode
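For readers unfamiliar with random parameter search, the general technique can be sketched as follows. This is a generic, standard-library-only illustration, not the code inside fastClassifier or fastRegressor; the function `random_search`, the parameter names, and the toy scoring function are all assumptions. Instead of trying every combination in the search space, a fixed number of combinations is sampled at random and the best-scoring one is kept.

```python
import random

def random_search(param_space, score_fn, n_iter=10, seed=0):
    """Sample n_iter random parameter combinations and return the best one."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space and toy objective that peaks at depth=5, rate=0.1.
space = {"depth": [1, 3, 5, 7, 9], "rate": [0.01, 0.1, 1.0]}

def toy_score(params):
    return -abs(params["depth"] - 5) - abs(params["rate"] - 0.1)

best_params, best_score = random_search(space, toy_score, n_iter=8)
print(best_params, best_score)
```

The trade-off is speed against coverage: sampling 8 of the 15 possible combinations is cheaper than an exhaustive search, but it may miss the true optimum.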

Updates on 8/10/2020

Stable version 0.1.0 released on PyPI

Updates on 8/7/2020

Add estimators: HuberRegressor, RidgeCV, LassoCV, SGDRegressor, and HistGradientBoostingRegressor
Modify parameters.json, and reset_parameters.json for the added estimators
Add autoViz for classification problem model retrieval diagram

Release history:

0.1.11 (this version): Sep 29, 2020
0.1.10: Sep 17, 2020
0.1.9: Sep 17, 2020
0.1.8: Sep 17, 2020
0.1.7: Aug 30, 2020
0.1.6: Aug 28, 2020
0.1.5: Aug 26, 2020
0.1.4: Aug 25, 2020
0.1.3: Aug 21, 2020
0.1.2: Aug 14, 2020
0.1.1: Aug 11, 2020
0.1.0: Aug 10, 2020
0.0.1: Aug 10, 2020

File details

Details for the file optimalflow-0.1.11.tar.gz.

File metadata

Download URL: optimalflow-0.1.11.tar.gz
Upload date: Sep 29, 2020
Size: 2.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.8.4

File hashes

Hashes for optimalflow-0.1.11.tar.gz
Algorithm	Hash digest
SHA256	`5a2b9ed9a39aef74b407980cac728089f493d1a2f2787032d2b64a3b4ac414c7`
MD5	`fd65640e98fc6ba8f353c59455f0e306`
BLAKE2b-256	`d06809972d942d4f7343b2da822a5d827875a8c8aafbbac61c5129eefb46891f`
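A downloaded archive can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. This is a minimal sketch: the helper names and the assumption that the file sits in the current directory are illustrative, not part of any OptimalFlow or PyPI tooling.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()

# SHA256 digest listed above for optimalflow-0.1.11.tar.gz.
EXPECTED = "5a2b9ed9a39aef74b407980cac728089f493d1a2f2787032d2b64a3b4ac414c7"

# Example call, assuming the archive was downloaded to the current directory:
# print(matches("optimalflow-0.1.11.tar.gz", EXPECTED))
```

If the function returns False, the file differs from the one uploaded to PyPI and should not be installed.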

File details

Details for the file optimalflow-0.1.11-py3-none-any.whl.

File metadata

Download URL: optimalflow-0.1.11-py3-none-any.whl
Upload date: Sep 29, 2020
Size: 2.9 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.8.4

File hashes

Hashes for optimalflow-0.1.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`366a697d712a1092f054cafab6d7d8162af4ab794dafc404b404cb406814d9cb`
MD5	`c3ee335b3d7bfda65711846330767ea0`
BLAKE2b-256	`4c5c9f0eedbb57adca19603be13e62f29154a976d6247be50f193f208f964837`
