retrain-pipelines lowers the barrier to entry for the creation and management of professional machine learning retraining pipelines.

Project description

This README is nowhere near ready yet.

retrain-pipelines simplifies the creation and management of machine learning retraining pipelines. The package is designed to remove the complexity of building end-to-end ML retraining pipelines, so that users can focus on their data and model architecture. With pre-built, highly adaptable pipeline examples that work out of the box, users can plug in their own data and begin retraining models with little to no setup.

Key features of retrain-pipelines include:

- Model version blessing: Automatically compare the performance of retrained models against previous best versions to ensure only superior models are deployed.
- Infrastructure validation: Each retraining pipeline includes inference pipeline packaging, local Docker container deployment, and request/response validation to ensure that models are production-ready.
- Comprehensive documentation: Every retraining pipeline is fully documented with sections covering Exploratory Data Analysis (EDA), hyperparameter tuning, retraining steps, model performance metrics, and key commands for retrieving training artifacts. Additionally, DAG information for the retraining process is readily available for pipeline transparency and debugging.
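
The idea behind model version blessing can be illustrated with a minimal sketch. It is not the package's actual implementation: `ModelVersion`, `is_blessed` and `min_gain` are hypothetical names introduced here, and a single higher-is-better metric is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelVersion:
    """Hypothetical record of one retrained model and its evaluation score."""
    version: str
    metric: float  # e.g. validation accuracy; higher is assumed to be better


def is_blessed(candidate: ModelVersion,
               previous_best: Optional[ModelVersion],
               min_gain: float = 0.0) -> bool:
    """Return True when the candidate should replace the previous best.

    A first-ever model is blessed by default; otherwise the candidate
    must beat the previous best by more than `min_gain`.
    """
    if previous_best is None:
        return True
    return candidate.metric - previous_best.metric > min_gain


best = ModelVersion("v1", 0.91)
print(is_blessed(ModelVersion("v2", 0.93), best))  # True
print(is_blessed(ModelVersion("v3", 0.90), best))  # False
```

A non-zero `min_gain` would guard against redeploying on gains that are within evaluation noise.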

In essence, retrain-pipelines offers a seamless solution: "Come with your data, and it works," with the added benefit of flexibility for more advanced users to adjust and extend pipelines as needed.

Customizability & Adaptability

retrain-pipelines offers a high degree of flexibility, allowing users to tailor the pre-shipped pipelines to their specific needs:

- Custom Preprocessing Functions: Users can provide their own Python functions for custom data preprocessing. For example, some built-in pipelines for tabular data allow optional bucketization of numerical features by name, but you can easily modify or extend these preprocessing steps to suit your dataset and feature requirements.
- Custom Pipeline Card Generation: You can specify custom Python functions to generate pipeline cards, such as including specific performance charts or metrics relevant to your use case.
- Custom HTML Templates: For further personalization, retrain-pipelines supports customizable HTML templates, enabling you to adjust formatting, insert specific charts, change page colors, or even add your company's logo to documentation pages.
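
As an illustration of what a custom preprocessing function might look like, here is a standard-library sketch of bucketizing numerical features by name. The function name `bucketize`, its signature and the row-as-dict representation are assumptions made for this example; the interface retrain-pipelines actually expects may differ.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence

Row = Dict[str, float]


def bucketize(rows: List[Row],
              boundaries: Dict[str, Sequence[float]]) -> List[Row]:
    """Replace selected numerical features by the index of their bucket.

    `boundaries` maps a feature name to sorted cut points; features that
    are not listed are passed through unchanged.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left untouched
        for name, cuts in boundaries.items():
            if name in new_row:
                # bisect_right counts the cut points <= value,
                # which is the zero-based bucket index
                new_row[name] = bisect_right(cuts, new_row[name])
        out.append(new_row)
    return out


rows = [{"age": 17, "income": 1200.0}, {"age": 42, "income": 5300.0}]
print(bucketize(rows, {"age": [18, 35, 65]}))
# [{'age': 0, 'income': 1200.0}, {'age': 2, 'income': 5300.0}]
```

Only `age` is bucketized because it is the only feature named in `boundaries`; `income` keeps its raw value.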

retrain-pipelines doesn't just streamline the retraining process; it helps teams iterate faster and deploy more robust models with confidence. Whether you're looking for an out-of-the-box solution or a highly customizable pipeline, retrain-pipelines is built to support continuous model improvement.

-- DRAFT --

- Say you use a custom "preprocessing.py", "pipeline_card.py" and/or "template.html".
  If you chose to log the run on WandB, you can retrieve the versioned artifacts there afterwards via the WandB inspector "name_here" that retrain-pipelines offers.

- incl. link to PyPI here: https://pypi.org/project/retrain-pipelines/

- It is fine to track your draft pipelines as you iterate on developing them, but keeping track of the artifacts generated during those dry runs, on the other hand, has no value. To address that and all the "..." that come with it, we propose sandboxing.
  Stateful yet ephemeral. Once you're happy with a given ML retraining pipeline advancement, you're free to drop all the draft artifacts.

launch tests

pytest -s tests

build from source

cd pkg_src && python -m build --verbose

install from source (dev mode) via:

pip install -e pkg_src

Release history

- 0.1.1 (Nov 4, 2024)
- 0.1.0 (Oct 14, 2024), this version

Download files

Source Distribution

retrain_pipelines-0.1.0.tar.gz (143.4 kB)

Uploaded Oct 14, 2024 Source

Built Distribution

retrain_pipelines-0.1.0-py3-none-any.whl (154.1 kB)

Uploaded Oct 14, 2024 Python 3

File details

Details for the file retrain_pipelines-0.1.0.tar.gz.

File metadata

Download URL: retrain_pipelines-0.1.0.tar.gz
Upload date: Oct 14, 2024
Size: 143.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for retrain_pipelines-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f0c90e108c8c22ca6992cfc37f18c59a5eb877ae8833e96baa720fc67b6e94f8`
MD5	`7aba12b525360fc28757d29a5b22118b`
BLAKE2b-256	`5c94860d2f2a069c9f73cb896c4251c3cb9309631d574dcd464c3b510ba59502`

File details

Details for the file retrain_pipelines-0.1.0-py3-none-any.whl.

File metadata

Download URL: retrain_pipelines-0.1.0-py3-none-any.whl
Upload date: Oct 14, 2024
Size: 154.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for retrain_pipelines-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7bc7cfdef10816c66599a0a578c1d1ae84e25b07e6e1780e39d1100ea2264104`
MD5	`ad01e1cb68d82651edf93acf3b03cba7`
BLAKE2b-256	`286ba8feda8dafa98876fdebc5a49fd46a8ba25375d70662a3fcb2efde9a5ad5`
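
The digests listed above can be checked against a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names `file_digests` and `matches` are introduced here for illustration, the file is assumed to sit in the current directory, and MD5 is left out because it may be unavailable on hardened Python builds.

```python
import hashlib
from pathlib import Path


def file_digests(path: Path, chunk_size: int = 1 << 20) -> dict:
    """Compute SHA256 and BLAKE2b-256 digests, reading the file in chunks."""
    hashers = {
        "SHA256": hashlib.sha256(),
        # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches(path: Path, algorithm: str, expected: str) -> bool:
    """Compare one computed digest with the published hex string."""
    return file_digests(path)[algorithm] == expected.strip().lower()


if __name__ == "__main__":
    # Assumed local path of the downloaded source distribution
    sdist = Path("retrain_pipelines-0.1.0.tar.gz")
    if sdist.exists():
        print(matches(
            sdist, "SHA256",
            "f0c90e108c8c22ca6992cfc37f18c59a5eb877ae8833e96baa720fc67b6e94f8",
        ))
```

For routine installs, pip's own hash-checking mode (`--require-hashes` with a requirements file) achieves the same result without custom code.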