
MindWare: Towards Efficient AutoML System.




MindWare: Efficient open-source AutoML system.

Volcano-ML is an AutoML system that automates feature engineering, algorithm selection and hyperparameter tuning. It improves search efficiency by decomposing the large joint AutoML search space into smaller sub-spaces. The system executes like the eruption of a volcano, hence the name 'Volcano-ML'.

Volcano-ML is developed by DAIM Lab at Peking University. The goal of Volcano-ML is to make machine learning easier to apply both in industry and academia. Currently, Volcano-ML is compatible with: Python >= 3.6.


Characteristics

  • User friendliness. Volcano-ML needs little human assistance. Users define a task in only a few lines of code, without dealing with the technical details of how the system executes.
  • High extensibility. New state-of-the-art ML algorithms and feature engineering operations can be added to the system easily. The decomposition techniques in Volcano-ML keep the search for the best configuration efficient even as the search space grows.
  • Advanced features. Volcano-ML provides special support for large datasets. In addition, Volcano-ML supports transfer-learning and meta-learning techniques that make the AutoML process behave more intelligently.
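The decomposition idea mentioned above can be illustrated with a toy sketch. It is not Volcano-ML's implementation: the objective `toy_error`, the two sub-spaces and the use of plain random search are all assumptions made up for illustration. The sketch conditions on the algorithm choice, searches each small sub-space separately, and keeps the best result.

```python
import random

# Hypothetical toy objective (lower is better); it stands in for a validation error.
def toy_error(algorithm, params):
    if algorithm == "knn":
        return 0.10 + abs(params["k"] - 5) * 0.01
    if algorithm == "tree":
        return 0.15 + abs(params["depth"] - 4) * 0.02
    raise ValueError(algorithm)

# One small sub-space per algorithm instead of a single large joint space.
SUBSPACES = {
    "knn": lambda rng: {"k": rng.randint(1, 15)},
    "tree": lambda rng: {"depth": rng.randint(1, 10)},
}

def search_subspace(algorithm, n_trials, rng):
    """Random search restricted to one algorithm's hyperparameters."""
    best = None
    for _ in range(n_trials):
        params = SUBSPACES[algorithm](rng)
        err = toy_error(algorithm, params)
        if best is None or err < best[0]:
            best = (err, algorithm, params)
    return best

def decomposed_search(n_trials_per_algorithm=20, seed=0):
    """Search every sub-space independently and return the overall best."""
    rng = random.Random(seed)
    results = [search_subspace(a, n_trials_per_algorithm, rng) for a in SUBSPACES]
    return min(results, key=lambda r: r[0])

best_err, best_alg, best_params = decomposed_search()
print(best_alg, best_params, round(best_err, 3))
```

A real system would replace the random search with a stronger optimizer and would allocate more trials to promising sub-spaces; the sketch only shows how the joint space is split.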

Releases

  • New release: v1.3, released on xx-xx-2021.

Example

Here is a brief example that uses the package.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from mindware.utils.data_manager import DataManager
from mindware.estimators import Classifier

# Load the iris dataset and hold out a stratified test split.
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1, stratify=y)

# Wrap the raw arrays in data nodes, the input format passed to the estimator.
dm = DataManager(X_train, y_train)
train_data = dm.get_data_node(X_train, y_train)
test_data = dm.get_data_node(X_test, y_test)

# Run the AutoML search under a time budget (time_limit is presumably in seconds).
clf = Classifier(time_limit=3600)
clf.fit(train_data)

# Predict on the held-out test data.
pred = clf.predict(test_data)
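To judge the result of the example, the predictions can be compared with the held-out labels. The helper below is a minimal sketch, assuming `pred` holds one predicted class label per test sample; the function name `accuracy` and the stand-in label lists are illustrative and not part of the package.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of empty inputs")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Stand-in labels; with the example above you would pass y_test and pred.
print(accuracy([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75
```

scikit-learn's `sklearn.metrics.accuracy_score(y_test, pred)` computes the same quantity.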

For more details and features, please see the examples.


Visualization

TODO.


Installation

Before installing Volcano-ML, please install the required build tool SWIG.

Volcano-ML requires SWIG (>= 3.0, <4.0) as a build dependency. We suggest downloading and installing SWIG 3.0.12.
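The version requirement can be checked before installation. The script below is a sketch, not part of the package: the helper names are hypothetical, and it assumes that the `swig` executable is on the PATH and that `swig -version` prints a line of the form "SWIG Version 3.0.12".

```python
import re
import shutil
import subprocess

def parse_swig_version(text):
    """Extract (major, minor, patch) from `swig -version` output, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(g) if g is not None else 0 for g in match.groups())

def is_supported(version):
    """True if the version satisfies >= 3.0 and < 4.0."""
    return version is not None and (3, 0) <= version[:2] < (4, 0)

def installed_swig_version(executable="swig"):
    """Return the version of the SWIG found on PATH, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # works on Python 3.6 as well
    )
    return parse_swig_version(result.stdout)

version = installed_swig_version()
if version is None:
    print("SWIG not found on PATH (or its version could not be parsed)")
elif is_supported(version):
    print("SWIG %d.%d.%d is in the supported range" % version)
else:
    print("SWIG %d.%d.%d is outside the range >= 3.0, < 4.0" % version)
```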

Then, you can install Volcano-ML itself. Volcano-ML supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1, and Windows 10 >= 1809. The installation requires a 64-bit Python environment with Python >= 3.6. There are two ways to install Volcano-ML:

Installation via pip

Volcano-ML is available on PyPI. You can install it by typing:

pip install mindware

Manual installation from the github source

If you want to try the latest code, please install Volcano-ML manually from the source code:

git clone https://github.com/thomas-young-2013/mindware.git && cd mindware
cat requirements/main.txt | xargs -n 1 -L 1 pip install
python setup.py install
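After either installation route, a quick sanity check can confirm that the interpreter meets the stated requirement (64-bit, Python >= 3.6) and that the `mindware` package is importable. This is a sketch using only the standard library; the helper names are hypothetical.

```python
import importlib.util
import struct
import sys

def python_is_supported(version_info=sys.version_info,
                        pointer_bits=struct.calcsize("P") * 8):
    """True for a 64-bit interpreter with Python >= 3.6."""
    return tuple(version_info[:2]) >= (3, 6) and pointer_bits == 64

def package_is_installed(name):
    """True if a top-level package with this name can be located."""
    return importlib.util.find_spec(name) is not None

print("Python OK:", python_is_supported())
print("mindware installed:", package_is_installed("mindware"))
```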

Tips on Installing Swig

Linux:

On Arch Linux (or any distribution that ships SWIG 4 by default), confirm that the installed SWIG version is in the range >= 3.0, <4.0.

We suggest installing SWIG 3.0.12:

./configure
make && make install

macOS:

Before installing SWIG, you need to install pcre:

cd $pcre_dir
./configure
make && make install

Then add /usr/local/lib to the library path so that pcre can be found:

LD_LIBRARY_PATH=/usr/local/lib:/usr/lib
export LD_LIBRARY_PATH

Finally, install Swig:

cd $swig_dir
./configure
make && make install

To install the Python package pyrfr (version 0.8.0), download its source code from PyPI and then run:

cd $pyrfr_dir
python setup.py install

Windows:

You need to download swigwin first, and then install Volcano-ML.


Feedback


Related Projects

Aiming at openness and at advancing state-of-the-art technology, we have also released another open-source project.

  • OpenBOX: an open source system and service to efficiently solve generalized blackbox optimization problems.

We encourage researchers to leverage the project to accelerate AI development and research.


Related Publications

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition
Yang Li, Yu Shen, Wentao Zhang, Jiawei Jiang, Bolin Ding, Yaliang Li, Jingren Zhou, Zhi Yang, Wentao Wu, Ce Zhang and Bin Cui. International Conference on Very Large Data Bases (VLDB 2021).

Efficient Automatic CASH via Rising Bandits
Yang Li, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang and Bin Cui. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020). https://ojs.aaai.org/index.php/AAAI/article/view/5910

MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements
Yang Li, Yu Shen, Jiawei Jiang, Jinyang Gao, Ce Zhang and Bin Cui. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2021). https://arxiv.org/abs/2012.03011

OpenBox: A Generalized Black-box Optimization Service
Yang Li, Yu Shen, Wentao Zhang, Yuanwei Chen, Huaijun Jiang, Mingchao Liu, Jiawei Jiang, Jinyang Gao, Wentao Wu, Zhi Yang, Ce Zhang and Bin Cui. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD 2021). https://arxiv.org/abs/2106.00421


License

The entire codebase is under the MIT license.
