
Funsies is a library and execution engine for building reproducible, composable and data-persistent computational workflows.


funsies is a python library and execution engine to build reproducible, fault-tolerant, distributed and composable computational workflows.

  • 🐍 Workflows are specified in pure python.
  • 🐦 Lightweight with few dependencies.
  • 🚀 Easy to deploy to compute clusters and distributed systems.
  • 🔧 Can be embedded in your own apps.
  • 📏 First-class support for static analysis. Use mypy to check your workflows!

Workflows are encoded in a redis server and executed using the distributed job queue library RQ. A hash tree data structure enables automatic and transparent caching and incremental computing.

Source docs can be found here. Some example funsies scripts can be found in the recipes folder.

Installation

Using pip,

pip install funsies

This will enable the funsies CLI tool as well as the funsies python module. Python 3.7, 3.8 and 3.9 are supported. To run workflows, you'll need a redis server. Redis can be installed using conda,

conda install redis

or pip,

pip install redis-server

Hello, funsies!

To run workflows, three components need to be connected:

  • 📜 a python script describing the workflow
  • 💻 a redis server that holds workflows and data
  • 👷 worker processes that execute the workflow

funsies is distributed: all three components can be on different computers or even be connected at different times. Redis is started using redis-server, workers are started using funsies worker and the workflow is run using python.

First, we start a redis server,

$ redis-server &

Next, we write a little funsies "Hello, world!" script,

from funsies import execute, Fun, reduce, shell
with Fun():
    # you can run shell commands
    cmd = shell('sleep 2; echo 👋 🪐')
    # and python ones
    python = reduce(sum, [3, 2])
    # outputs are saved at hash addresses
    print(f"my outputs are saved to {cmd.stdout.hash[:5]} and {python.hash[:5]}")

The workflow is just a normal python script,

$ python hello-world.py
my outputs are saved to 4138b and 80aa3

The Fun() context manager takes care of connections. Running this workflow takes much less time than the sleep 2 command would and prints no greeting: funsies workflows are lazily evaluated.
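To illustrate what lazy evaluation means here, the following is a minimal sketch in plain python. It does not use funsies: the Lazy class, the calls list and the step names are hypothetical, chosen only to show that describing a step and running it are separate events.

```python
calls = []  # records which operations have actually run


class Lazy:
    """A deferred computation: stores a function and its arguments."""

    def __init__(self, name, func, *args):
        self.name = name
        self.func = func
        self.args = args
        self.done = False
        self.value = None

    def evaluate(self):
        """Run dependencies first, then this step, at most once."""
        if not self.done:
            args = [a.evaluate() if isinstance(a, Lazy) else a for a in self.args]
            self.value = self.func(*args)
            self.done = True
            calls.append(self.name)
        return self.value


# Describing the workflow runs nothing yet.
total = Lazy("sum", lambda x, y: x + y, 3, 2)
doubled = Lazy("double", lambda x: 2 * x, total)
calls_after_definition = list(calls)

# Work happens only when a result is requested.
result = doubled.evaluate()
```

In this toy version the caller triggers evaluation itself. In funsies, by contrast, the script only describes the workflow and worker processes do the actual work.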

A worker process can be started from the CLI, and execution of the two outputs is then requested using their hashes,

$ funsies worker &
$ funsies execute 4138b 80aa3

Once the worker is finished, results can be printed directly to stdout using their hashes,

$ funsies cat 4138b
👋 🪐
$ funsies cat 80aa3
5

They can also be accessed from within python, from other steps in the workflow, etc.

How does it work?

The design of funsies is inspired by git and ccache. All files and variable values are abstracted into a provenance-tracking DAG structure. Basically, "files" are identified entirely based on what operations lead to their creation. This (somewhat opinionated) design produces interesting properties that are not common in workflow engines:

Incremental computation

funsies automatically and transparently saves all input and output "files". This provides checkpointing and incremental computing with no extra effort from the user. Re-running the same funsies script, even on a different machine, will not perform any computations (beyond database lookups). Modifying the script and re-running it will only recompute changed results.

In contrast with e.g. Make, this is not based on modification date but directly on the data history, which is more robust to changes in the workflow.
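The idea of addressing results by their history rather than by timestamps can be sketched in a few lines of plain python. This is an illustration under stated assumptions, not the funsies implementation: the address function, the ToyStore class and the use of SHA-256 over JSON are hypothetical, and for brevity the toy store computes eagerly instead of handing work to workers.

```python
import hashlib
import json


def address(op_name, input_addresses):
    """Hash an operation name together with the addresses of its inputs."""
    payload = json.dumps({"op": op_name, "inputs": list(input_addresses)})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyStore:
    """In-memory stand-in for the database that holds results."""

    def __init__(self):
        self.values = {}
        self.evaluations = 0  # how many times real work was done

    def constant(self, value):
        """Store a literal input; its address depends only on the value."""
        key = address("const:" + json.dumps(value), [])
        self.values[key] = value
        return key

    def apply(self, op_name, func, *input_addresses):
        """Compute a step only if no result exists at its address."""
        key = address(op_name, input_addresses)
        if key not in self.values:
            args = [self.values[a] for a in input_addresses]
            self.values[key] = func(*args)
            self.evaluations += 1
        return key


store = ToyStore()
a = store.constant(3)
b = store.constant(2)
total = store.apply("add", lambda x, y: x + y, a, b)
# Same operation on the same inputs: same address, so nothing is recomputed.
again = store.apply("add", lambda x, y: x + y, a, b)
# A changed input yields a new address, so only this step is computed.
changed = store.apply("add", lambda x, y: x + y, a, store.constant(5))
```

No modification date appears anywhere in the lookup: a result is reused exactly when the operation and the addresses of its inputs are unchanged.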

Decentralized workflows

Workflows and their elements are not identified based on any global indexing scheme. This makes it possible to generate workflows fully dynamically from any connected computer node, to merge or compose DAGs from different databases and to dynamically re-parametrize them, etc.
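A small sketch can show why the absence of a global index makes merging straightforward. The code below is plain python and is not the funsies data model: node_address, add_node and the operation names are hypothetical. Two DAGs are built independently and then combined with an ordinary dictionary union.

```python
import hashlib


def node_address(operation, parents):
    """Address of a node: a hash of its operation and its parents' addresses."""
    text = operation + "(" + ",".join(parents) + ")"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def add_node(dag, operation, parents=()):
    """Record a node in `dag`, a dict of address -> (operation, parents)."""
    key = node_address(operation, parents)
    dag[key] = (operation, tuple(parents))
    return key


# Two workflows built independently, e.g. on two different machines.
dag_one = {}
raw = add_node(dag_one, "load:data.csv")
clean = add_node(dag_one, "clean", [raw])

dag_two = {}
raw_again = add_node(dag_two, "load:data.csv")
stats = add_node(dag_two, "summarize", [raw_again])

# Merging is a plain dictionary union: shared steps collapse onto one address.
merged = {**dag_one, **dag_two}
```

Because both machines derive the same address for the shared loading step, the merged DAG holds three nodes rather than four, and no renumbering or coordination is needed.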

No local file operations

All "files" are encoded in a redis instance, with no local filesystem operations. funsies workers can operate without any permanent data storage, as is often the case in containerized deployments. File-driven workflows can therefore run using only a container's tmpfs.

Is it production-ready?

🧪 warning: funsies is research-grade code! 🧪

At this time, the funsies API is fairly stable. However, users should know that database dumps are not yet fully forward- or backward-compatible, and breaking changes are likely to be introduced in new releases.

Related projects

funsies is intended as a lightweight alternative to industrial workflow engines, such as Apache Airflow or Luigi. We rely heavily on awesome python libraries: RQ, loguru, Click and chevron. We are inspired by git, ccache, snakemake targets, rain and others. A comprehensive list of other workflow engines can be found here.

License

funsies is provided under the MIT license.

