Caching Workflow Engine
CacheFlow is a caching workflow engine: it executes dataflows while reusing previous results where appropriate, for efficiency. It is designed to be extensible and to be embedded in other projects.
- ☑ Python 3 workflow system
- ☑ Executes dataflows from JSON files (a hypothetical example is sketched after this list)
- ☐ Can also load from SQL database
- ☐ Parallel execution
- ☐ Streaming
- ☑ Extensible: can add new modules, new storage formats, new caching mechanisms, new executors (see the plugin sketch after this list)
- ☐ Pluggable: extensions can be installed from PyPI without forking
- ☑ Re-usable: can execute workflows by itself, but can also be embedded into applications. Some applications I plan on developing myself:
- Literate-programming app: snippets or modules embedded in a Markdown file and executed on render (similar to R Markdown). Results would be cached, making later renders fast
- Integration into some of my NYU research projects (VisTrails, Vizier, D3M)
- ☐ Use Jupyter kernels as backends to execute code, giving me quick access to all the languages they support (see the jupyter_client sketch after this list)
- ☐ Isolate script execution (to run untrusted Python/… code, for example with Docker; see the Docker sketch after this list)
- Make a super-scalable and fast workflow execution engine: I'd rather build executors based on Spark, Dask, or Ray than re-implement those
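To make the "dataflows from JSON files" item concrete, here is a rough sketch of the kind of description such a file could contain. The field names (`steps`, `module`, `inputs`, `connections`) are hypothetical, not necessarily CacheFlow's actual schema:

```json
{
  "steps": [
    {"id": "load",  "module": "csv_reader", "inputs": {"path": "data.csv"}},
    {"id": "clean", "module": "drop_nulls"},
    {"id": "stats", "module": "describe"}
  ],
  "connections": [
    {"from": "load.table",  "to": "clean.table"},
    {"from": "clean.table", "to": "stats.table"}
  ]
}
```

With each step's outputs keyed by the step id and its inputs, results from a previous run can be looked up and reused whenever neither the step nor anything upstream of it has changed.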
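For the extensible/pluggable items, the usual Python mechanism is package entry points: a third-party package on PyPI registers its modules under an agreed group name and the engine discovers them at runtime, no fork needed. The group name `cacheflow.modules` and the module interface below are assumptions for illustration, not CacheFlow's actual API:

```python
# my_plugin/modules.py -- hypothetical third-party extension package
class Uppercase:
    """Illustrative module: uppercases its text input (interface is assumed)."""
    def execute(self, inputs):
        return {"text": inputs["text"].upper()}

# In the plugin's pyproject.toml, the module is advertised under an assumed
# entry-point group so the engine can find it once the package is installed:
#
#   [project.entry-points."cacheflow.modules"]
#   uppercase = "my_plugin.modules:Uppercase"

# How a host engine can discover installed plugins (standard library,
# Python 3.10+ selection API):
from importlib.metadata import entry_points

for ep in entry_points(group="cacheflow.modules"):
    print(ep.name, "->", ep.load())
```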
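The Jupyter-kernel backend idea can lean on the existing jupyter_client package, which already knows how to start a kernel for any installed language and run code in it. A minimal sketch of that mechanism (independent of how CacheFlow will wrap it):

```python
from queue import Empty
from jupyter_client.manager import start_new_kernel

# Any installed kernel name works here: "python3", "ir", "julia-1.10", ...
km, kc = start_new_kernel(kernel_name="python3")
try:
    kc.execute("print(21 * 2)")
    # Read IOPub messages until the kernel reports it is idle again,
    # collecting anything written to stdout along the way.
    while True:
        try:
            msg = kc.get_iopub_msg(timeout=10)
        except Empty:
            break
        if msg["msg_type"] == "stream":
            print("kernel said:", msg["content"]["text"], end="")
        elif (msg["msg_type"] == "status"
              and msg["content"]["execution_state"] == "idle"):
            break
finally:
    kc.stop_channels()
    km.shutdown_kernel()
```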
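For isolating untrusted code, one straightforward route is running each snippet in a throwaway container through the Docker SDK for Python (how this will be wired into CacheFlow's executors is still open):

```python
import docker  # pip install docker

client = docker.from_env()

# Run an untrusted snippet in a disposable, network-less container.
output = client.containers.run(
    "python:3.11-slim",
    ["python", "-c", "print(sum(range(10)))"],
    network_disabled=True,  # no network access for untrusted code
    mem_limit="128m",       # cap memory
    remove=True,            # delete the container when it exits
)
print(output.decode())  # "45"
```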
Basic structures are here, extracted from D3M. Execution works.
| Filename | Size | File type | Python version |
|---|---|---|---|
| cacheflow-0.1-py3-none-any.whl | 11.9 kB | Wheel | py3 |
| cacheflow-0.1.tar.gz | 8.3 kB | Source | None |