Machine learning data flow for reproducible data science
IDEAL HACKDUCK PROJECT
Several pipelines for dataflow (Prefect):
- nothing -> data_generation -> save to disk
- preprocessing
- augmentation
- postprocessing
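The stages listed above can be sketched as plain Python functions chained together. This is only an illustration under assumptions: it does not use Prefect, and the function bodies, the `run_pipeline` helper, and the `raw.json` file name are hypothetical stand-ins rather than HackDuck's actual code.

```python
import json
import random
import tempfile
from pathlib import Path


def data_generation(n, seed):
    """Generate n pseudo-random samples; stands in for a real data source."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def save_to_disk(data, path):
    """Persist the generated data so later stages can start from a file."""
    path.write_text(json.dumps(data))
    return path


def preprocessing(data):
    """Rescale values to the [0, 1] range (assumes the values are not all equal)."""
    low, high = min(data), max(data)
    return [(x - low) / (high - low) for x in data]


def augmentation(data):
    """Append a mirrored copy of every sample."""
    return data + [1.0 - x for x in data]


def postprocessing(data):
    """Round values for reporting."""
    return [round(x, 3) for x in data]


def run_pipeline(n=5, seed=0):
    """Hypothetical driver: generate, save, reload, then run the three stages."""
    with tempfile.TemporaryDirectory() as tmp:
        path = save_to_disk(data_generation(n, seed), Path(tmp) / "raw.json")
        raw = json.loads(path.read_text())
    return postprocessing(augmentation(preprocessing(raw)))


result = run_pipeline()
print(len(result))
```

In a Prefect-based version, each function would presumably become a task and `run_pipeline` a flow; the chaining order would stay the same.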
Model handling (PyTorch & Ignite):
- fit -> take X and Y and train the model
- evaluate -> take X and Y, predict, and return metrics
- predict -> take X, return Y
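To make the three-method interface concrete, here is a minimal sketch that uses a one-dimensional least-squares line in place of a PyTorch model. The `LinearModel` class and the `mse` metric are assumptions chosen to keep the example dependency-free; only the `fit` / `evaluate` / `predict` split comes from the description above.

```python
class LinearModel:
    """1-D least-squares line y = a*x + b exposing fit/evaluate/predict."""

    def __init__(self):
        self.a = 0.0
        self.b = 0.0

    def fit(self, X, Y):
        """Learn slope and intercept from paired samples."""
        n = len(X)
        mean_x = sum(X) / n
        mean_y = sum(Y) / n
        var_x = sum((x - mean_x) ** 2 for x in X)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
        self.a = cov_xy / var_x
        self.b = mean_y - self.a * mean_x
        return self

    def predict(self, X):
        """Return one predicted Y per input X."""
        return [self.a * x + self.b for x in X]

    def evaluate(self, X, Y):
        """Predict on X, compare with Y, and return a dict of metrics."""
        preds = self.predict(X)
        mse = sum((p - y) ** 2 for p, y in zip(preds, Y)) / len(Y)
        return {"mse": mse}


# Train on points from y = 2x + 1, then evaluate and predict on unseen inputs.
model = LinearModel().fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
metrics = model.evaluate([4.0, 5.0], [9.0, 11.0])
predictions = model.predict([10.0])
```

Keeping `evaluate` as "predict, then score" means any metric added later only touches one method.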
Save logs and artifacts (MLflow):
- save metrics during training (Ignite)
- save a snapshot of the data before and after each pipeline
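One way to picture "save data before and after each pipeline" is a decorator that writes a stage's input and output to disk. The sketch below is an assumption, not HackDuck's implementation: the `snapshot` decorator and the file naming are hypothetical, and it writes plain JSON files instead of calling MLflow. With MLflow, the written files could then be passed to `mlflow.log_artifact`.

```python
import functools
import json
import tempfile
from pathlib import Path


def snapshot(artifact_dir):
    """Decorator factory: dump a stage's input and output as JSON files."""
    def decorator(stage):
        @functools.wraps(stage)
        def wrapper(data):
            before = artifact_dir / f"{stage.__name__}_before.json"
            after = artifact_dir / f"{stage.__name__}_after.json"
            before.write_text(json.dumps(data))   # state entering the stage
            result = stage(data)
            after.write_text(json.dumps(result))  # state leaving the stage
            return result
        return wrapper
    return decorator


# Hypothetical artifact location; a temporary directory keeps the sketch tidy.
artifact_dir = Path(tempfile.mkdtemp())


@snapshot(artifact_dir)
def preprocessing(data):
    """Toy stage: double every value."""
    return [x * 2 for x in data]


output = preprocessing([1, 2, 3])
saved = sorted(p.name for p in artifact_dir.iterdir())
```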
Serve the model from a REST app (MLflow):
- save a GitHub folder for each project
- can easily get predictions on a batch of data
FEATURES:
- seed for reproducibility
- map arguments to loop over a list
- MLflow integration (logs parameters automatically; can also log metrics or artifacts)
- all Prefect advantages
- handle subflows
- task bank to do basic operations
- unit tests handled by ward
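The first two features, seeding and mapping a task over a list, can be illustrated together. This is a minimal sketch with hypothetical helpers (`set_seed`, `map_task`, `noisy_square`); it seeds only Python's built-in generator, whereas a real project would likely also seed NumPy and PyTorch, and Prefect tasks provide their own mapping mechanism.

```python
import random


def set_seed(seed):
    """Seed Python's global RNG so random draws repeat across runs."""
    random.seed(seed)


def noisy_square(x):
    """Toy task whose output depends on the RNG state."""
    return x * x + random.random()


def map_task(task, values):
    """Apply one task to every element of a list, in order."""
    return [task(v) for v in values]


# Two runs with the same seed produce identical mapped results.
set_seed(42)
first = map_task(noisy_square, [1, 2, 3])
set_seed(42)
second = map_task(noisy_square, [1, 2, 3])
print(first == second)
```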
TODO:
- [ ] map over subflows?
- [ ] pip package for TaskBank, and save its commit (needed to rerun the flow)
- [ ] save Python files inside mlruns/..., commit them to git, and save the git commit
- [ ] be able to rerun a previous flow (save args, kwargs, and output ref)
- [ ] run it in Docker
- [ ] deploy to production with Travis CI, which creates the MLflow git repo
- [ ] do deep learning with it
File details
Details for the file HackDuck-0.1.0.tar.gz.
File metadata
- Download URL: HackDuck-0.1.0.tar.gz
- Upload date:
- Size: 3.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ef84297ef286323f725cda885f13d35521e955b26d3ec73e6b5875772115455 |
| MD5 | d348b66e0612a694ce1a905365e5f587 |
| BLAKE2b-256 | 288938e7d3dc42e6d33b7a578cfb2f953c38059dee7a252818267384aca8a204 |
File details
Details for the file HackDuck-0.1.0-py3-none-any.whl.
File metadata
- Download URL: HackDuck-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 12b0e18f1113d0c20c22afeb44d2901b876f0e74919c5f682ed5b926555ae153 |
| MD5 | b8e1d063985112dbcbe8139ee3990fa7 |
| BLAKE2b-256 | 3192f8b8716eeb3f71e78137f4844159b56ddf64e047b7a9c1e62a2f8e41fe31 |