A democratized, lightweight, and transparent AutoML framework

eZAutoML

Overview

eZAutoML is a framework designed to make Automated Machine Learning (AutoML) accessible to everyone. It provides an easy-to-use interface based on the scikit-learn API for building modelling pipelines with minimal effort.

The framework is built around a few core concepts:

  1. Optimizers: Black-box optimization methods for hyperparameters.
  2. Easy Tabular Pipelines: Simple domain-specific language to describe pipelines for preprocessing and model training.
  3. Scheduling (work in progress): Intended to enable horizontal scalability, from a single computer to datacenters, by using Airflow executors.
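
To make the first concept concrete, here is a minimal sketch of black-box hyperparameter optimization by random search, written with the standard library only. It is not eZAutoML's own optimizer: the names `random_search`, `toy_objective`, and the `(low, high)` search-space format are assumptions made for this illustration.

```python
import random


def random_search(objective, space, n_trials=10, seed=42):
    """Sample configurations uniformly at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter from its (low, high) range.
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        # The optimizer only sees the score, never the internals of the objective.
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    """Hypothetical objective whose maximum is 0.0 at x = 2, y = -1."""
    return -((config["x"] - 2.0) ** 2) - (config["y"] + 1.0) ** 2


space = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}
best_config, best_score = random_search(toy_objective, space, n_trials=200)
print(best_config, best_score)
```

In a real AutoML run the objective would train a model with the sampled hyperparameters and return a validation score, and the trial budget plays the same role as the `--trials` option described below.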

Installation

Package Distribution

The latest version of eZAutoML can be installed via PyPI or from source.

pip install ezautoml
ezautoml --help

Install from source

To install from source, you can clone this repo and install with pip:

pip install -e .

Usage

Command Line Interface

eZAutoML can be used programmatically, and it also provides a lightweight CLI that instantiates tabular AutoML pipelines with a single command. For example:

ezautoml --dataset data/smoking.csv --target smoking --task classification --trials 10 --verbose   

Options:

  • dataset: Path to the dataset file (CSV, Parquet, etc.)
  • target: The target column name for prediction
  • task: Task type: classification/c or regression/r
  • search: Black-box optimization algorithm to perform
  • output: Directory to save the output models/results
  • trials: Maximum number of trials inside an optimization algorithm
  • verbose: Increase logging verbosity
  • version: Show the current version

For more detailed help, use:

ezautoml --help
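
If you want to launch the CLI from a Python script, the command can be assembled as an argument list. The sketch below is a hypothetical helper, not part of eZAutoML. The `--dataset`, `--target`, `--task`, `--trials`, and `--verbose` flags come from the example above; the `--search` and `--output` spellings are assumed from the option names, so check `ezautoml --help` before relying on them.

```python
import shlex


def build_ezautoml_command(dataset, target, task, trials=None,
                           search=None, output=None, verbose=False):
    """Build the argument list for an ezautoml CLI call (hypothetical helper)."""
    if task not in {"classification", "c", "regression", "r"}:
        raise ValueError(f"unknown task: {task!r}")
    argv = ["ezautoml", "--dataset", str(dataset), "--target", target, "--task", task]
    if trials is not None:
        argv += ["--trials", str(trials)]
    if search is not None:
        argv += ["--search", search]  # assumed flag spelling
    if output is not None:
        argv += ["--output", str(output)]  # assumed flag spelling
    if verbose:
        argv.append("--verbose")
    return argv


argv = build_ezautoml_command(
    "data/smoking.csv", "smoking", "classification", trials=10, verbose=True
)
print(shlex.join(argv))
# ezautoml --dataset data/smoking.csv --target smoking --task classification --trials 10 --verbose
```

Passing the list to `subprocess.run(argv, check=True)` would execute the command. Keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.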

Some features are still a work in progress and will be enabled in future releases, such as scheduling, metalearning, and pipelines.

Python Script

You can also use eZAutoML from Python scripts, though this interface is still being developed. In the future, it will also let you work with custom pipelines.

    import time
    from sklearn.model_selection import train_test_split
    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import accuracy_score
    from ezautoml.model import eZAutoML
    from ezautoml.space.search_space import SearchSpace
    from ezautoml.evaluation.metric import MetricSet, Metric
    from ezautoml.evaluation.task import TaskType
    from ezautoml.optimization.optimizers.random_search import RandomSearchOptimizer

    # Load dataset (classification example)
    data = load_breast_cancer()
    X, y = data.data, data.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Define metrics for classification
    metrics = MetricSet(
        {"accuracy": Metric(name="accuracy", fn=accuracy_score, minimize=False)},
        primary_metric_name="accuracy"
    )
    # Load classification search space
    search_space = SearchSpace.from_yaml("classification_space.yaml")
    # Initialize eZAutoML for classification
    ezautoml = eZAutoML(
        search_space=search_space,
        task=TaskType.CLASSIFICATION,
        metrics=metrics,
        max_trials=10,
        max_time=600,  
        seed=42
    )
    ezautoml.fit(X_train, y_train)
    test_accuracy = ezautoml.test(X_test, y_test)
    ezautoml.summary(k=5)
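
The `minimize` flag in the metric definition above tells the search which direction counts as an improvement. The following standard-library sketch shows one way such a flag can drive trial selection. It is an illustration only and does not reproduce eZAutoML's `Metric` or `MetricSet` classes: `SimpleMetric`, `best_trial`, and the toy predictions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class SimpleMetric:
    """Hypothetical stand-in for a metric with an optimization direction."""
    name: str
    fn: Callable[[Sequence, Sequence], float]
    minimize: bool = False

    def is_better(self, candidate, incumbent):
        # Lower is better for losses, higher is better for scores.
        return candidate < incumbent if self.minimize else candidate > incumbent


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def best_trial(metric, y_true, trial_predictions):
    """Return the (name, score) of the trial that is best under the metric."""
    best_name, best_score = None, None
    for name, y_pred in trial_predictions.items():
        score = metric.fn(y_true, y_pred)
        if best_score is None or metric.is_better(score, best_score):
            best_name, best_score = name, score
    return best_name, best_score


y_true = [1, 0, 1, 1]
trials = {"trial_a": [1, 0, 0, 1], "trial_b": [1, 0, 1, 1]}

acc = SimpleMetric("accuracy", accuracy, minimize=False)
mae = SimpleMetric("mae", mean_absolute_error, minimize=True)
print(best_trial(acc, y_true, trials))  # ('trial_b', 1.0)
print(best_trial(mae, y_true, trials))  # ('trial_b', 0.0)
```

Both metrics pick the same trial here, but only because the direction is set correctly: with `minimize=False` on the error metric, the worse trial would win.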

Contributing

We welcome contributions to eZAutoML! If you'd like to contribute, please fork the repository and submit a pull request with your changes. For detailed information on how to contribute, please refer to our contributing guide.

License

eZAutoML is licensed under the BSD 3-Clause License. See the LICENSE file for more information.
