Prefect scheduler for YAML configuration

Prefect YAML

A package to run Prefect with a YAML configuration. For further details, please refer to the documentation.

Installation

Install this via pip (or your favourite package manager):

pip install prefect-yaml

Usage

Run the prefect-yaml command-line tool with the specified configuration file.

For example, the following YAML configuration is located in examples/simple_config.yaml.

metadata:
  output:
    directory: .output

task:
  task_a:
    caller: math:fabs
    parameters:
      - -9.0
    output:
      format: json
  task_b:
    caller: math:sqrt
    parameters:
      - !data task_a
    output:
      directory: null
  task_c:
    caller: math:fsum
    parameters:
      - [!data task_b, 1]
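
The caller values in the configuration above (math:fabs, math:sqrt, math:fsum) appear to follow a module:function pattern. As a rough illustration of how such a string can be turned into a Python callable, here is a minimal sketch. The helper name resolve_caller is hypothetical and is not taken from prefect-yaml; the package may resolve callers differently.

```python
import importlib
from typing import Any, Callable


def resolve_caller(spec: str) -> Callable[..., Any]:
    """Resolve a 'module:attribute' string to a Python callable.

    Hypothetical helper for illustration only; not part of prefect-yaml.
    """
    module_name, _, attr_path = spec.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    target: Any = importlib.import_module(module_name)
    # Walk dotted attributes so strings such as 'os:path.join' also resolve.
    for name in attr_path.split("."):
        target = getattr(target, name)
    return target


fabs = resolve_caller("math:fabs")
print(fabs(-9.0))  # 9.0
```

Under this reading, the entries listed under parameters are passed to the resolved function as positional arguments.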

Run the following command to write all task outputs to the .output directory under the current working directory.

prefect-yaml -c examples/simple_config.yaml

The output directory contains all the task outputs in the specified format.

% tree .output
.output
├── task_a.json
└── task_c.pickle

0 directories, 2 files

The expected behavior is to:

  1. run task_a to dump the value fabs(-9.0) to the output directory in JSON format,
  2. run task_b to compute the value sqrt(9.0) from the output of task_a, and
  3. run task_c to dump the value fsum([3.0, 1.0]) to the output directory in pickle format.

Because the output directory of task_b is overridden with null, the output of task_b is not written to disk and is passed to task_c in memory. The output format of task_c is not specified, so its result is dumped in the default format (pickle).
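
To make the data flow concrete, the three tasks correspond to the following plain-Python calls. This is only an equivalent computation with the standard math module; it does not use prefect-yaml or Prefect, and the variable names simply mirror the task names.

```python
import math

# Plain-Python equivalent of the three tasks in examples/simple_config.yaml.
task_a = math.fabs(-9.0)         # result written to .output/task_a.json
task_b = math.sqrt(task_a)       # kept in memory only (directory: null)
task_c = math.fsum([task_b, 1])  # result written to .output/task_c.pickle

print(task_a, task_b, task_c)  # 9.0 3.0 4.0
```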

For further details, please see the configuration section of the documentation.
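
After a run, the files in the output directory can be inspected from Python. The sketch below assumes that each file holds the task's return value serialized with the standard json or pickle module; that is an assumption about the file contents, not something this README states. The helper load_output is hypothetical, and stand-in files are written to a temporary directory so the snippet runs on its own.

```python
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any


def load_output(path: Path) -> Any:
    """Load one task output, choosing the reader from the file extension."""
    if path.suffix == ".json":
        with path.open("r", encoding="utf-8") as handle:
            return json.load(handle)
    if path.suffix == ".pickle":
        with path.open("rb") as handle:
            # Only unpickle files that you produced yourself.
            return pickle.load(handle)
    raise ValueError(f"unsupported output format: {path.suffix}")


# Stand-in files so the sketch is self-contained; after a real run,
# point output_dir at the .output directory instead.
with tempfile.TemporaryDirectory() as tmp:
    output_dir = Path(tmp)
    (output_dir / "task_a.json").write_text(json.dumps(9.0), encoding="utf-8")
    (output_dir / "task_c.pickle").write_bytes(pickle.dumps(4.0))

    results = {p.stem: load_output(p) for p in sorted(output_dir.iterdir())}

print(results)  # {'task_a': 9.0, 'task_c': 4.0}
```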

Configuration

The output section defines how a task's return value is written and loaded. The output section under metadata applies to all tasks globally, while the output section of an individual task overrides the global parameters for that task.

For further details, please see the documentation for parameter definitions in each section.
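
One way to picture the override rule is as a layered dictionary merge: defaults first, then the global metadata output, then the task's own output. The sketch below is a mental model under that assumption, not the package's implementation. The function effective_output and the DEFAULTS mapping are hypothetical; the values are taken from examples/simple_config.yaml.

```python
from typing import Any, Dict, Optional

# Assumed default for this sketch: pickle is the format used when none is given.
DEFAULTS: Dict[str, Any] = {"format": "pickle"}


def effective_output(
    global_output: Dict[str, Any],
    task_output: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge settings so that later layers override earlier ones."""
    return {**DEFAULTS, **global_output, **(task_output or {})}


global_output = {"directory": ".output"}
task_overrides = {
    "task_a": {"format": "json"},     # overrides the format only
    "task_b": {"directory": None},    # null directory: keep the result in memory
    "task_c": None,                   # no task-level output section
}

resolved = {
    name: effective_output(global_output, override)
    for name, override in task_overrides.items()
}
for name, settings in resolved.items():
    print(name, settings)
```

Here task_a ends up with the JSON format in .output, task_b has no directory, and task_c falls back to pickle in .output, which matches the files listed in the usage example.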

Output

The built-in output formats are pickle (the default) and JSON. Users can also define their own output format.

For example, to use pandas to load and dump Parquet files with the pyarrow engine by default, you can define the configuration as follows.

metadata:
  format: parquet
  dump-caller: object.to_parquet
  dump-parameters:
    engine: pyarrow
  load-caller: pandas:read_parquet
  load-parameters:
    engine: pyarrow

All output parameters, such as the directory, dumper and loader, can be overridden at the task level. You can also specify which tasks are exported to the output directory, while the results of the others are only passed to downstream tasks in memory.

For further details, please see the output section of the documentation.

Roadmap

The project is still under development. The basic features are mostly available, and the following features are planned:

  • Multi-cloud storage support
  • Subtask support within each task

Contributing

Contributions at all levels are welcome. Please refer to the contributing section for development and release guidelines.
