Schedule parameterized notebooks programmatically using a CLI or a REST API
NB Workflows
Description
If SQL is the lingua franca for querying data, Jupyter should be the lingua franca for data exploration, model training, and the complex, one-off tasks that come with working with data.
NB Workflows is a library and a platform for running parameterized notebooks in a distributed way. A notebook can be launched remotely on demand, or scheduled at fixed intervals or with cron syntax.
Internally it uses Sanic as the web server, papermill as the notebook executor, and RQ for task distribution and coordination.
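The core idea behind papermill's parameterization is simple: the notebook author tags one cell `parameters` to hold defaults, and the executor injects a new cell with the caller's values right after it. Here is a minimal stdlib-only sketch of that mechanism (the real logic lives in papermill; the notebook dict below is a toy):

```python
import json

def inject_parameters(nb: dict, params: dict) -> dict:
    """Return a copy of the notebook dict with an injected-parameters cell,
    mimicking what papermill does before executing the notebook."""
    cells = list(nb["cells"])
    # Locate the cell the author tagged "parameters" (holds the defaults).
    idx = next(
        (i for i, c in enumerate(cells)
         if "parameters" in c.get("metadata", {}).get("tags", [])),
        -1,
    )
    # Render the caller's values as assignment statements.
    source = "\n".join(f"{k} = {v!r}" for k, v in params.items())
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": source,
        "outputs": [],
        "execution_count": None,
    }
    cells.insert(idx + 1, injected)
    return {**nb, "cells": cells}

# A tiny notebook with a single tagged parameters cell.
nb = {
    "cells": [
        {"cell_type": "code",
         "metadata": {"tags": ["parameters"]},
         "source": "alpha = 0.1",
         "outputs": [], "execution_count": None},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
}

result = inject_parameters(nb, {"alpha": 0.5, "run_id": "exp-42"})
print(json.dumps(result["cells"][1]["source"]))
```

The injected cell overrides the defaults because it runs after the tagged cell; this is what lets the same `.ipynb` file serve both interactive exploration and scheduled production runs.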
Goal
Empower the different data roles in a project to put code into production, reducing the time required to do so. It lets people go from a data exploration to an entire pipeline deployed in production, using the same notebook file produced by a data scientist, analyst, or any other role that works with data iteratively.
Features
- Define a notebook like a function, and execute it on demand
- Automatic Dockerfile generation, so a project shares a single environment
- Docker building and versioning: each release is built and tracked
- Execution history, and notifications to Slack or Discord
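As an illustration of the notification feature above: both Slack and Discord accept incoming-webhook POSTs with a small JSON body (Slack uses a `text` field). The sketch below builds such a payload from a hypothetical execution-history record; the record's field names and the webhook URL are illustrative, not NB Workflows' actual schema.

```python
import json
from urllib import request

# Hypothetical execution-history record (illustrative field names).
record = {"notebook": "etl.ipynb", "status": "ok", "elapsed_secs": 12.3}

# Slack incoming webhooks accept a JSON body with a "text" field.
payload = json.dumps({
    "text": (f"Workflow {record['notebook']} finished "
             f"{record['status']} in {record['elapsed_secs']}s")
}).encode()

req = request.Request(
    "https://hooks.slack.com/services/XXX/YYY/ZZZ",  # placeholder URL
    data=payload,
    headers={"Content-Type": "application/json"},
)
# request.urlopen(req)  # not executed here; needs a real webhook URL
print(payload.decode())
```

A Discord webhook works the same way, with `content` instead of `text` as the message field.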
Roadmap
See Roadmap draft
Architecture
References & inspirations
Hashes for nb_workflows-0.6.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b2f07b49bf18d406affb82357d6ae45b1895800d2a7ee86a08e3c9dcdb138ba |
| MD5 | 5e287bdd9ea9fc9271cd3abe4ef680a2 |
| BLAKE2b-256 | 8029486dbf10ee8508e83b25e6a567cca963e5e271368d9056cedd2458c14c4e |