Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud.
Notebooks are hard to maintain. Teams often prototype projects in notebooks, but maintaining them is an error-prone process that slows progress down. Ploomber overcomes the challenges of working with .ipynb files, allowing teams to develop collaborative, production-ready pipelines using JupyterLab or any text editor.
Installation
Compatible with Python 3.6 and higher.
Install with pip:
pip install ploomber
Or with conda:
conda install ploomber -c conda-forge
Getting started
Use Binder to try out Ploomber without setting up an environment, or run an example locally:
# ML pipeline example
ploomber examples -n templates/ml-basic -o ml-basic
cd ml-basic
# if using pip
pip install -r requirements.txt
# if using conda
conda env create --file environment.yml
conda activate ml-basic
# run pipeline
ploomber build
Pipeline output is saved in the output/ folder. Check out the pipeline definition in the pipeline.yaml file.
To get a list of examples, run ploomber examples.
More examples are available in our examples repository.
Main Features
- Automated notebook refactoring. Automatically convert a legacy notebook into a maintainable, modular pipeline (see demo).
- Scripts as notebooks. Open .py files as notebooks, then execute them from the terminal and generate an output notebook to review results.
- Dependency resolution. Quickly build a DAG by referring to previous tasks in your code; Ploomber infers execution order and orchestrates execution.
- Incremental builds. Speed up iterations by skipping tasks whose source code hasn't changed since the last execution.
- Production-ready. Deploy to Kubernetes (via Argo Workflows), Airflow, and AWS Batch without code changes.
- Parallelization. Run independent tasks in parallel.
- Testing. Import pipelines in any testing framework and test them with any CI service (e.g. GitHub Actions).
- Flexible. Use Jupyter notebooks, Python scripts, R scripts, SQL scripts, Python functions, or a combination of them as pipeline tasks. Write pipelines using a pipeline.yaml file or with Python.
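To make the dependency resolution and incremental builds features more concrete, here is a minimal conceptual sketch using only the Python standard library (graphlib requires Python 3.9 or higher). It is not Ploomber's implementation and does not use Ploomber's API. The PIPELINE dictionary, the task names, and the helper functions are hypothetical. The sketch also assumes that a task re-runs when any of its upstream tasks re-ran, which is an illustrative choice rather than documented behavior.

```python
import hashlib
from graphlib import TopologicalSorter

# Hypothetical pipeline: task name -> (source code, names of upstream tasks)
PIPELINE = {
    "load": ("df = read('raw.csv')", []),
    "clean": ("df = drop_nulls(upstream['load'])", ["load"]),
    "plot": ("draw(upstream['clean'])", ["clean"]),
    "train": ("fit(upstream['clean'])", ["clean"]),
}


def source_hash(source):
    """Fingerprint a task's source code so changes can be detected."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def execution_order(pipeline):
    """Infer a valid execution order from the declared upstream tasks."""
    graph = {name: set(upstream) for name, (_, upstream) in pipeline.items()}
    return list(TopologicalSorter(graph).static_order())


def build(pipeline, previous_hashes):
    """Return (names of tasks that had to run, hashes to store for next time)."""
    executed = []
    new_hashes = {}
    for name in execution_order(pipeline):
        source, upstream = pipeline[name]
        new_hashes[name] = source_hash(source)
        source_changed = previous_hashes.get(name) != new_hashes[name]
        # Assumption: a task is also outdated if any upstream task re-ran.
        upstream_ran = any(dep in executed for dep in upstream)
        if source_changed or upstream_ran:
            executed.append(name)  # a real runner would execute the task here
    return executed, new_hashes


# First build: nothing is cached, so every task runs.
first_run, hashes = build(PIPELINE, {})

# Second build: no source changed, so every task is skipped.
second_run, hashes = build(PIPELINE, hashes)

# Edit one task's source: only that task and its downstream tasks run.
PIPELINE["clean"] = ("df = drop_nulls(upstream['load']).dropna()", ["load"])
third_run, hashes = build(PIPELINE, hashes)
```

In this sketch the third build skips load, because its source is unchanged and it has no upstream tasks, while clean, plot, and train run again.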
Hashes for ploomber-0.14.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 37a4e3a0fe6a831d437637bb3392fb12354ce85c09ea4106899afc28ce02efe5
MD5 | 642f4be5f123a2401047ae3eea3ff6e7
BLAKE2b-256 | d08c66243b03c8432e62cb07396de353b530917b4f4e4e9edd5a25d2b5cbd398