
MindWare: Towards Efficient AutoML System.

MindWare: an efficient open-source AutoML system.

Volcano-ML is an AutoML system that automates feature engineering, algorithm selection, and hyperparameter tuning. It improves search efficiency by decomposing the large AutoML search space into smaller ones. The system executes like the eruption of a volcano, hence the name 'Volcano-ML'.
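
As a rough illustration of what decomposing a search space can mean, the toy sketch below splits a joint space by algorithm and runs a separate random search inside each block. Everything in it (SEARCH_SPACE, toy_score, the per-block budget) is hypothetical and stands in for real models and validation scores. It is not Volcano-ML's actual decomposition strategy, which is described in the VolcanoML paper listed under Related Publications.

```python
import random

# Hypothetical joint search space: each algorithm owns its hyperparameter ranges.
SEARCH_SPACE = {
    "knn": {"n_neighbors": (1, 30)},
    "tree": {"max_depth": (1, 20), "min_samples_leaf": (1, 10)},
}


def toy_score(algorithm, config):
    """Made-up stand-in for a validation score (higher is better)."""
    if algorithm == "knn":
        return 1.0 - abs(config["n_neighbors"] - 7) / 30
    return 1.0 - abs(config["max_depth"] - 5) / 20 - 0.01 * config["min_samples_leaf"]


def sample_config(space, rng):
    """Draw one integer value per hyperparameter from its inclusive range."""
    return {name: rng.randint(low, high) for name, (low, high) in space.items()}


def search_subspace(algorithm, space, budget, rng):
    """Random search restricted to a single algorithm's hyperparameters."""
    best = None
    for _ in range(budget):
        config = sample_config(space, rng)
        score = toy_score(algorithm, config)
        if best is None or score > best[0]:
            best = (score, config)
    return best


def decomposed_search(search_space, budget_per_block, seed=0):
    """Search each algorithm's block separately, then keep the best block."""
    rng = random.Random(seed)
    results = {
        algorithm: search_subspace(algorithm, space, budget_per_block, rng)
        for algorithm, space in search_space.items()
    }
    best_algorithm = max(results, key=lambda name: results[name][0])
    best_score, best_config = results[best_algorithm]
    return best_algorithm, best_config, best_score
```

Each block here has one or two dimensions, so a small budget covers it reasonably well, whereas a single search over the joint space would have to handle every algorithm's hyperparameters at once.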

Volcano-ML is developed by the DAIM Lab at Peking University. The goal of Volcano-ML is to make machine learning easier to apply in both industry and academia. Currently, Volcano-ML is compatible with Python >= 3.6.


Characteristics

  • User friendliness. Volcano-ML needs little human assistance. Users can define a task in only a few lines of code, without dealing with the technical details of how the system executes it.
  • High extensibility. New state-of-the-art ML algorithms or feature engineering operations can be added to the system easily. The decomposition techniques in Volcano-ML keep the search for the best configurations efficient over the enlarged search space.
  • Advanced features. Volcano-ML provides dedicated support for large datasets. In addition, it uses transfer-learning and meta-learning techniques to make the AutoML process more intelligent.

Releases

  • New release: v1.3, released on xx-xx-2021.

Example

Here is a brief example that uses the package.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from mindware.utils.data_manager import DataManager
from mindware.estimators import Classifier

# Load the iris dataset and hold out a third of it for testing.
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1, stratify=y)

# Wrap the train and test arrays in data nodes.
dm = DataManager(X_train, y_train)
train_data = dm.get_data_node(X_train, y_train)
test_data = dm.get_data_node(X_test, y_test)

# Fit the classifier within the budget given by time_limit.
clf = Classifier(time_limit=3600)
clf.fit(train_data)

# Predict on the held-out data.
pred = clf.predict(test_data)
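
The predictions from the example can then be scored against the held-out labels. The helper below is a minimal sketch that uses only the standard library. It assumes that pred is a sequence of predicted class labels aligned one-to-one with y_test, which the example does not state explicitly; the name accuracy is our own and is not part of the package.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Usage with the variables from the example (assumed shapes, see above):
# print(accuracy(y_test, pred))
```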

For more details and characteristics, please check examples.


Visualization

TODO.


Installation

Before installing Volcano-ML, please install the necessary library swig.

Volcano-ML requires SWIG (>= 3.0, < 4.0) as a build dependency. We suggest downloading and installing SWIG 3.0.12.
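
To confirm that the SWIG on your PATH falls inside the required range, a small check like the one below may help. It is a sketch, not part of the package: the function names are our own, and it assumes that `swig -version` prints a line of the form "SWIG Version X.Y.Z".

```python
import re
import shutil
import subprocess


def parse_swig_version(output):
    """Extract (major, minor, patch) from `swig -version` output, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)\.(\d+)", output)
    return tuple(int(part) for part in match.groups()) if match else None


def is_supported(version):
    """True when the version satisfies >= 3.0 and < 4.0."""
    return version is not None and (3, 0) <= version[:2] < (4, 0)


def installed_swig_version():
    """Return the version of the swig found on PATH, or None if it is missing."""
    if shutil.which("swig") is None:
        return None
    result = subprocess.run(
        ["swig", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_swig_version(result.stdout)


# Usage:
# version = installed_swig_version()
# print(version, "supported" if is_supported(version) else "not supported")
```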

Then you can install Volcano-ML itself. Volcano-ML supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1, and Windows 10 >= 1809. Installation requires a 64-bit Python environment with Python >= 3.6. There are two ways to install Volcano-ML:

Installation via pip

Volcano-ML is available on PyPI. You can install it by typing:

pip install mindware
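
After installing, you can sanity-check the environment against the requirements stated above (64-bit Python >= 3.6) and confirm that the package is importable. The sketch below uses only the standard library; environment_report is a hypothetical helper name, not something the package provides.

```python
import importlib.util
import struct
import sys


def environment_report(package="mindware", min_python=(3, 6)):
    """Summarize whether the interpreter and package meet the stated requirements."""
    return {
        # Tuple comparison: (3, 8, ...) >= (3, 6) is True.
        "python_ok": sys.version_info >= min_python,
        # Pointer size in bits tells a 64-bit interpreter from a 32-bit one.
        "is_64bit": struct.calcsize("P") * 8 == 64,
        # find_spec returns None when a top-level package is not installed.
        "package_found": importlib.util.find_spec(package) is not None,
    }


# Usage:
# print(environment_report())
```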

Manual installation from the GitHub source

If you want to try the latest code, you can install Volcano-ML manually from source:

git clone https://github.com/thomas-young-2013/mindware.git && cd mindware
cat requirements/main.txt | xargs -n 1 -L 1 pip install
python setup.py install
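
On systems without cat and xargs (for example a plain Windows shell), the second command can be approximated in Python. This is a sketch under assumptions: it installs one requirements line at a time, as the xargs invocation does, and it additionally skips blank lines and full-line comments. The function names are our own.

```python
import shlex
import subprocess
import sys


def requirement_lines(text):
    """Return the non-empty, non-comment lines of a requirements file."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def install_each(path):
    """Run `pip install` once per requirement line, using the current interpreter."""
    with open(path, encoding="utf-8") as handle:
        for line in requirement_lines(handle.read()):
            subprocess.check_call(
                [sys.executable, "-m", "pip", "install", *shlex.split(line)]
            )


# Usage, from the root of the cloned repository:
# install_each("requirements/main.txt")
```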

Tips on Installing Swig

Linux:

On Arch Linux (or any distribution that ships SWIG 4 by default), you need to confirm that the SWIG version is in the range >= 3.0, < 4.0.

We suggest installing SWIG 3.0.12 from source:

./configure
make && make install

macOS:

Before installing SWIG, you need to install pcre:

cd $pcre_dir
./configure
make && make install

Then add /usr/local/lib to the library path so that pcre can be found:

LD_LIBRARY_PATH=/usr/local/lib:/usr/lib
export LD_LIBRARY_PATH

Next, install SWIG:

cd $swig_dir
./configure
make && make install

Finally, to install the Python package pyrfr (version 0.8.0), download its source code from PyPI and install it from the source directory:

cd $pyrfr_dir
python setup.py install

Windows:

You need to download swigwin first, and then install Volcano-ML.


Feedback


Related Projects

To promote openness and advance state-of-the-art technology, we have also released another open-source project.

  • OpenBOX: an open source system and service to efficiently solve generalized blackbox optimization problems.

We encourage researchers to use the project to accelerate AI development and research.


Related Publications

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition
Yang Li, Yu Shen, Wentao Zhang, Jiawei Jiang, Bolin Ding, Yaliang Li, Jingren Zhou, Zhi Yang, Wentao Wu, Ce Zhang and Bin Cui. International Conference on Very Large Data Bases (VLDB 2021).

Efficient Automatic CASH via Rising Bandits
Yang Li, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang and Bin Cui. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020). https://ojs.aaai.org/index.php/AAAI/article/view/5910

MFES-HB: Efficient Hyperband with Multi-Fidelity Quality Measurements
Yang Li, Yu Shen, Jiawei Jiang, Jinyang Gao, Ce Zhang and Bin Cui. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2021). https://arxiv.org/abs/2012.03011

OpenBox: A Generalized Black-box Optimization Service
Yang Li, Yu Shen, Wentao Zhang, Yuanwei Chen, Huaijun Jiang, Mingchao Liu, Jiawei Jiang, Jinyang Gao, Wentao Wu, Zhi Yang, Ce Zhang and Bin Cui. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD 2021). https://arxiv.org/abs/2106.00421


License

The entire codebase is under MIT license.
