neuraxle

Neuraxle is a Machine Learning (ML) library for building neat pipelines, providing the right abstractions to ease the research, development, and deployment of your ML applications.

Project description

Code Machine Learning Pipelines - The Right Way.

Neuraxle is a Machine Learning (ML) library for building clean machine learning pipelines using the right abstractions.

Component-Based: Build encapsulated steps, then compose them to build complex pipelines.
Evolving State: Each pipeline step can fit and evolve through the learning process.
Hyperparameter Tuning: Optimize your pipelines using AutoML, where each pipeline step has its own hyperparameter space.
Compatible: Use your favorite machine learning libraries inside and outside Neuraxle pipelines.
Production Ready: Pipeline steps can manage how they are saved by themselves, and the lifecycle of the objects allows for train and test modes.
Streaming Pipeline: Transform data in many pipeline steps at the same time, in parallel, using multiprocessing Queues.
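
The component-based and evolving-state ideas above can be illustrated with a minimal plain-Python sketch. This is not Neuraxle's implementation: the names Step, MeanCenter, Scale and SimplePipeline are hypothetical and exist only to show how encapsulated steps that acquire state during fit can be composed.

```python
class Step:
    """Hypothetical base class: a step can learn from data, then transform it."""

    def fit(self, data):
        return self  # stateless by default

    def transform(self, data):
        raise NotImplementedError


class MeanCenter(Step):
    """Learns the mean during fit and subtracts it during transform."""

    def __init__(self):
        self.mean = 0.0

    def fit(self, data):
        self.mean = sum(data) / len(data)
        return self

    def transform(self, data):
        return [x - self.mean for x in data]


class Scale(Step):
    """Stateless step that multiplies every value by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, data):
        return [x * self.factor for x in data]


class SimplePipeline(Step):
    """Composes steps: each step is fitted, then its output feeds the next one."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, data):
        for step in self.steps:
            data = step.fit(data).transform(data)
        return self

    def transform(self, data):
        for step in self.steps:
            data = step.transform(data)
        return data


# Fit on training values, then reuse the learned mean on new data.
pipeline = SimplePipeline([MeanCenter(), Scale(2.0)]).fit([1.0, 2.0, 3.0])
centered_and_scaled = pipeline.transform([4.0])
```

Because the pipeline is itself a step in this sketch, pipelines can be nested inside other pipelines, which is the kind of composition the feature list refers to.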

Documentation

You can find the Neuraxle documentation on the website. It also contains multiple examples demonstrating some of its features.

Installation

Simply do:

pip install neuraxle
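
To check that the installation worked, one option is to query the installed version from Python. The sketch below uses only the standard library (importlib.metadata, available since Python 3.8); the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


neuraxle_version = installed_version("neuraxle")
print("neuraxle version:", neuraxle_version or "not installed")
```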

Examples

We have several examples on the website.

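For example, you can build a time series processing pipeline as follows:

# Import paths are given as found in recent Neuraxle releases; adjust them if your version differs.
from neuraxle.pipeline import Pipeline
from neuraxle.steps.data import DataShuffler
from neuraxle.steps.flow import TrainOnlyWrapper

# WindowTimeSeries and generate_classification_data are assumed to be defined elsewhere.
p = Pipeline([
    TrainOnlyWrapper(DataShuffler()),
    WindowTimeSeries(),
])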

# Load data
X_train, y_train, X_test, y_test = generate_classification_data()

# The pipeline will learn on the data and acquire state.
p = p.fit(X_train, y_train)

# Once it learned, the pipeline can process new and
# unseen data for making predictions.
y_test_predicted = p.predict(X_test)
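
The example above calls generate_classification_data without defining it. A minimal stand-in, assuming all that is needed is a small synthetic two-class dataset returned in the same order (X_train, y_train, X_test, y_test), could look like the sketch below. The function body, its parameters and the use of plain lists are assumptions made for illustration; real pipelines usually work on NumPy arrays.

```python
import random


def generate_classification_data(n_samples=100, n_features=4, test_ratio=0.2, seed=0):
    """Hypothetical helper: two Gaussian blobs, split into train and test sets."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        # Class 1 is shifted by +2 on every feature so the classes are separable.
        X.append([rng.gauss(2.0 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    n_test = int(n_samples * test_ratio)
    n_train = n_samples - n_test
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


X_train, y_train, X_test, y_test = generate_classification_data()
```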

You can also tune your hyperparameters using AutoML algorithms such as the Tree Parzen Estimator (TPE):

# Define classification models with their hyperparameter spaces.
# All SKLearn models can be used and compared to each other.
# Give each of them a hyperparameter space like this:
decision_tree_classifier = SKLearnWrapper(
    DecisionTreeClassifier(),
    HyperparameterSpace({
        'criterion': Choice(['gini', 'entropy']),
        'splitter': Choice(['best', 'random']),
        'min_samples_leaf': RandInt(2, 5),
        'min_samples_split': RandInt(2, 4)
    }))

# More SKLearn models can be added (code details skipped):
random_forest_classifier = ...
logistic_regression = ...

# It's possible to mix TensorFlow models into Neuraxle as well,
# using Neuraxle-TensorFlow's Tensorflow2ModelStep class, passing in
# TensorFlow functions such as create_model and create_optimizer:
minibatched_tensorflow_classifier = EpochRepeater(MiniBatchSequentialPipeline([
        Tensorflow2ModelStep(
            create_model=create_linear_model,
            create_optimizer=create_adam_optimizer,
            create_loss=create_mse_loss_with_regularization
        ).set_hyperparams_space(HyperparameterSpace({
            'hidden_dim': RandInt(6, 750),
            'layers_stacked_count': RandInt(1, 4),
            'lambda_loss_amount': Uniform(0.0003, 0.001),
            'learning_rate': Uniform(0.001, 0.01),
            'window_size_future': FixedHyperparameter(sequence_length),
            'output_dim': FixedHyperparameter(output_dim),
            'input_dim': FixedHyperparameter(input_dim)
        }))
    ]), epochs=42)

# Define a classification pipeline that lets the AutoML loop choose one of the classifiers.
# See also ChooseOneStepOf documentation: https://www.neuraxle.org/stable/api/steps/neuraxle.steps.flow.html#neuraxle.steps.flow.ChooseOneStepOf
pipeline = Pipeline([
    ChooseOneStepOf([
        decision_tree_classifier,
        random_forest_classifier,
        logistic_regression,
        minibatched_tensorflow_classifier,
    ])
])

# Create the AutoML loop object.
# See also AutoML documentation: https://www.neuraxle.org/stable/api/metaopt/neuraxle.metaopt.auto_ml.html#neuraxle.metaopt.auto_ml.AutoML
auto_ml = AutoML(
    pipeline=pipeline,
    hyperparams_optimizer=TreeParzenEstimator(
        # This is the TPE as in Hyperopt.
        number_of_initial_random_step=20,
    ),
    validation_splitter=ValidationSplitter(validation_size=0.20),
    scoring_callback=ScoringCallback(accuracy_score, higher_score_is_better=True),
    n_trials=40,
    epochs=1,  # Could be higher if only TF models were used.
    hyperparams_repository=HyperparamsOnDiskRepository(cache_folder='neuraxle_dashboard'),
    refit_best_trial=True,
    continue_loop_on_error=False
)

# Load data, and launch AutoML loop!
X_train, y_train, X_test, y_test = generate_classification_data()
auto_ml = auto_ml.fit(X_train, y_train)

# Because `refit_best_trial=True` was passed to AutoML, predict uses the model refitted from the best trial.
y_pred = auto_ml.predict(X_test)

accuracy = accuracy_score(y_true=y_test, y_pred=y_pred)
print("Test accuracy score:", accuracy)
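
To make the loop above less of a black box, here is a minimal plain-Python sketch of what a hyperparameter search does conceptually: sample candidate hyperparameters, fit on the training part, score on the validation part, keep the best trial, then refit it on all the data. It deliberately swaps TPE for plain random search and uses a toy threshold classifier. None of the names below (sample_params, fit, predict, random_search) come from Neuraxle; they are assumptions made for illustration.

```python
import random


def sample_params(rng):
    """Draw one candidate from a hypothetical two-dimensional search space."""
    return {"feature_index": rng.choice([0, 1]), "shift": rng.uniform(-1.0, 1.0)}


def fit(params, X, y):
    """Learn a threshold: midpoint of the two class means, plus the sampled shift."""
    column = [row[params["feature_index"]] for row in X]
    mean_0 = sum(v for v, label in zip(column, y) if label == 0) / y.count(0)
    mean_1 = sum(v for v, label in zip(column, y) if label == 1) / y.count(1)
    return {**params, "threshold": (mean_0 + mean_1) / 2.0 + params["shift"]}


def predict(model, X):
    """Predict class 1 when the selected feature exceeds the learned threshold."""
    return [int(row[model["feature_index"]] > model["threshold"]) for row in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def random_search(X, y, n_trials=40, validation_size=0.20, seed=0):
    """Keep the best of n_trials random candidates, then refit it on all data."""
    rng = random.Random(seed)
    n_valid = int(len(X) * validation_size)
    X_fit, y_fit = X[:-n_valid], y[:-n_valid]
    X_valid, y_valid = X[-n_valid:], y[-n_valid:]
    best_params, best_score = None, -1.0
    for _ in range(n_trials):
        params = sample_params(rng)
        model = fit(params, X_fit, y_fit)
        score = accuracy(y_valid, predict(model, X_valid))
        if score > best_score:
            best_params, best_score = params, score
    return fit(best_params, X, y), best_score


# Toy data: only the first feature carries information about the label.
X = [[(i * 7 % 40) / 10.0, 0.5] for i in range(40)]
y = [int(row[0] > 2.0) for row in X]
best_model, best_score = random_search(X, y)
```

TPE differs from this sketch in how candidates are drawn: after some initial random trials, it uses the scores of past trials to favor promising regions of the space instead of sampling uniformly.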

Why Neuraxle?

Most research projects never reach production. However, you want your project to be production-ready and clean enough to adapt by the time you finish it. You also want things to be simple so that you can get started quickly. Read more about the why of Neuraxle here.

Community

Please post technical questions on StackOverflow using the neuraxle tag. The StackOverflow question will automatically be posted in Neuraxio’s Slack workspace and in the #Neuraxle channel of our Gitter.

For suggestions, feature requests, and error reports, please open an issue.

For contributors, we recommend using the PyCharm code editor and letting it manage the virtual environment, with the default code auto-formatter and pytest as the test runner. To contribute, first fork the project, then make your changes, and then open a pull request in the main repository. Please make your pull request(s) editable so that we can, for example, add you to the list of contributors if you didn’t add the entry yourself. Ensure that all tests pass before opening a pull request. You also agree that your contributions will be licensed under the Apache 2.0 License, which is required for everyone to be able to use your open-source contributions.

Finally, you can also join our Slack workspace and our Gitter to collaborate with us. We <3 collaborators. You can also subscribe to our mailing list, where we post updates and news.

License

Neuraxle is licensed under the Apache License, Version 2.0.

Citation

You may cite our extended abstract, which was presented at the Montreal Artificial Intelligence Symposium (MAIS) 2019. Here is the BibTeX entry to cite:

@misc{neuraxle,
author = {Chevalier, Guillaume and Brillant, Alexandre and Hamel, Eric},
year = {2019},
month = {09},
pages = {},
title = {Neuraxle - A Python Framework for Neat Machine Learning Pipelines},
doi = {10.13140/RG.2.2.33135.59043}
}

Contributors

Thanks to everyone who contributed to the project:

Guillaume Chevalier: https://github.com/guillaume-chevalier
Alexandre Brillant: https://github.com/alexbrillant
Éric Hamel: https://github.com/Eric2Hamel
Jérôme Blanchet: https://github.com/JeromeBlanchet
Michaël Lévesque-Dion: https://github.com/mlevesquedion
Philippe Racicot: https://github.com/Vaunorage
Neurodata: https://github.com/NeuroData-ltd
Klaimohelmi: https://github.com/Klaimohelmi
Vincent Antaki: https://github.com/vincent-antaki

Supported By

We thank these organisations for generously supporting the project:

Neuraxio Inc.: https://github.com/Neuraxio
Umanéo Technologies Inc.: https://www.umaneo.com/
Solution Nexam Inc.: https://nexam.io/
La Cité, LP: https://www.lacitelp.com/accueil
Kimoby: https://www.kimoby.com/

Project details

Release history

0.8.1: Aug 16, 2022 (this version)
0.8.0: Jul 22, 2022
0.7.2: Jul 22, 2022
0.7.1: Jul 22, 2022
0.7.0: Apr 15, 2022
0.6.1: Oct 17, 2021
0.6.0: Jun 29, 2021
0.5.7: Feb 25, 2021
0.5.6: Sep 20, 2020
0.5.5: Sep 15, 2020
0.5.4: Sep 11, 2020
0.5.3: Sep 10, 2020
0.5.2: Jul 20, 2020
0.5.0: Jul 10, 2020
0.4.1: May 25, 2020
0.4.0: Apr 5, 2020
0.3.4: Mar 12, 2020
0.3.3: Mar 12, 2020
0.3.2: Feb 19, 2020
0.3.1: Jan 16, 2020
0.3.0: Dec 25, 2019
0.2.2: Dec 19, 2019
0.2.1: Oct 30, 2019
0.2.0: Oct 24, 2019
0.1.1: Sep 26, 2019
0.1.0: Jun 27, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neuraxle-0.8.1.tar.gz (159.9 kB)

Uploaded Aug 16, 2022 Source

File details

Details for the file neuraxle-0.8.1.tar.gz.

File metadata

Download URL: neuraxle-0.8.1.tar.gz
Upload date: Aug 16, 2022
Size: 159.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/63.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for neuraxle-0.8.1.tar.gz
Algorithm	Hash digest
SHA256	`394ce071a8a1a7adef2f36e79d3ad9a8ac3e612be7f30fc6d874d3e4b9b97959`
MD5	`9a2fe06c7e1ba1b356309457df7d4ba1`
BLAKE2b-256	`f2a48b477a0dae198a9048a991fa9fea4cccfc306c0fb031d39ef7461ed1ff1e`

See more details on using hashes here.
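
The published digests can be used to check a downloaded archive. The sketch below computes the SHA256 of a local file with the standard hashlib module and compares it with the value listed above. The helper names sha256_of_file and matches_expected are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Stream the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the computed digest with the published one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 digest listed above for neuraxle-0.8.1.tar.gz.
EXPECTED_SHA256 = "394ce071a8a1a7adef2f36e79d3ad9a8ac3e612be7f30fc6d874d3e4b9b97959"

# Usage, assuming the archive sits in the current directory:
# matches_expected("neuraxle-0.8.1.tar.gz", EXPECTED_SHA256)
```

SHA256 is the digest to prefer for this check; MD5 is listed for completeness but is not considered collision-resistant.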
