PALMA

Project for Automated Learning MAchine

This library provides tools for an automated machine learning (AutoML) approach. Since many tools already exist for individual components of an AutoML workflow, the library aims to provide a structure rather than a complete service. It uses a broad definition of AutoML that covers hyperparameter optimization, model history tracking, performance analysis, and so on: in short, any step that can be replicated and that should, in most cases, be part of the analysis of a model. Because it is built from components, the library is modular and lets users add their own analyses.
It therefore contains the following elements:

  1. A vanilla approach, described below in the Basic usage section and in the classification and regression notebooks

  2. A collection of components that can be added to enrich the analysis

Install notice

python -m pip install git+https://github.com/eurobios-mews-labs/palma.git

Basic usage

Start your project

To start using the library, use the Project class:

import pandas as pd
from sklearn import model_selection
from sklearn.datasets import make_classification
from palma import Project

X, y = make_classification(n_informative=2, n_features=100)
X, y = pd.DataFrame(X), pd.Series(y).astype(bool)
project = Project(problem="classification", project_name="default")
project.start(
    X, y,
    splitter=model_selection.ShuffleSplit(n_splits=10, random_state=42),
)

Instantiating Project defines the type of problem, and the start method sets up what is needed to carry out an ML project:

  • A testing strategy (argument splitter), which defines the train and test instances. A scikit-learn cross-validator is used for this. Hyper-parameter optimisation needs a single train/test split, so only the first split of the splitter is used. This implies, for instance, that if you want a shuffled 80/20 split, you should set test_size explicitly, since ShuffleSplit holds out 10% of the data by default:
splitter = model_selection.ShuffleSplit(n_splits=5, test_size=0.2, random_state=42)
  • Training data X and target y
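
To make the splitter behaviour concrete, here is a minimal sketch that uses scikit-learn only (it does not call palma). It assumes a 100-row toy dataset and shows the sizes of the first split, the one described above as being used for the train/test separation during hyper-parameter optimisation.

```python
from sklearn import model_selection
from sklearn.datasets import make_classification

# Toy dataset (assumption for illustration): 100 samples, 20 features.
X, y = make_classification(
    n_samples=100, n_features=20, n_informative=2, random_state=0
)

# test_size=0.2 requests an 80/20 split; if it is omitted,
# ShuffleSplit holds out 10% of the samples instead.
splitter = model_selection.ShuffleSplit(
    n_splits=5, test_size=0.2, random_state=42
)

# Take only the first of the five shuffled splits.
train_idx, test_idx = next(splitter.split(X, y))
print(len(train_idx), len(test_idx))  # 80 20
```

Note that n_splits controls how many shuffled splits are generated, not the proportion of data held out in each of them.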

Run hyper-optimisation

The hyper-optimisation process looks for the best model in a pool of models that tend to perform well on a variety of problems. This task relies on the FLAML module. Once hyper-parameter optimisation is done, the tracked metric can be computed.

from palma import ModelSelector

ms = ModelSelector(engine="FlamlOptimizer",
                   engine_parameters=dict(time_budget=30))
ms.start(project)
print(ms.best_model_)
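
The idea of searching a pool of models can be illustrated without FLAML. The sketch below is not what ModelSelector or FLAML do internally: it is a plain scikit-learn loop over a small, hypothetical candidate pool, scored on the same kind of splitter, and is meant only to show what "best model in a pool" means. FLAML additionally tunes hyper-parameters within a time budget.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=200, n_features=20, n_informative=2, random_state=0
)
splitter = ShuffleSplit(n_splits=3, test_size=0.2, random_state=42)

# Hypothetical candidate pool, chosen for illustration only.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean accuracy of each candidate over the splits.
scores = {
    name: cross_val_score(est, X, y, cv=splitter, scoring="accuracy").mean()
    for name, est in candidates.items()
}

# Keep the candidate with the highest mean score and refit it on all data.
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].fit(X, y)
print(best_name, round(scores[best_name], 3))
```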

Tailoring and analysing your estimator

from palma import ModelEvaluation
from sklearn.ensemble import RandomForestClassifier

# Use your own estimator
model = ModelEvaluation(estimator=RandomForestClassifier())
model.fit(project)

# Get the optimized estimator
model = ModelEvaluation(estimator=ms.best_model_)
model.fit(project)
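
How ModelEvaluation works internally is not described here. As a rough analogy only, the following scikit-learn sketch fits one estimator on every split of a splitter and collects train and test metrics per split, which is the kind of information an evaluation step typically produces. The dataset, metrics and estimator settings are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_validate

X, y = make_classification(
    n_samples=200, n_features=20, n_informative=2, random_state=0
)
splitter = ShuffleSplit(n_splits=5, test_size=0.2, random_state=42)

# Fit a fresh copy of the estimator on each split and score it.
results = cross_validate(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X, y,
    cv=splitter,
    scoring=["accuracy", "roc_auc"],
    return_train_score=True,   # also report scores on the training part
    return_estimator=True,     # keep the fitted estimator of each split
)

# Summarise the test metrics across splits (mean and standard deviation).
for metric in ("accuracy", "roc_auc"):
    test_scores = results[f"test_{metric}"]
    print(metric, round(test_scores.mean(), 3), round(test_scores.std(), 3))
```

Comparing the train and test scores of each split gives a quick indication of overfitting before moving on to more detailed analyses.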

Manage components

You can add components to enrich the project. See here for detailed documentation.

Authors

Eurobios Mews Labs
