Calrissian

CWL on Kubernetes

Overview

Calrissian is a CWL implementation designed to run inside a Kubernetes cluster. Its goal is to be highly efficient and scalable, taking advantage of high-capacity clusters to run many steps in parallel.

Cluster Requirements

Calrissian requires a Kubernetes or OpenShift/OKD cluster configured to provision PersistentVolumes with the ReadWriteMany access mode. Kubernetes installers and cloud providers don't usually include this type of storage, so it may require additional configuration.

Calrissian has been tested with NFS using the nfs-client-provisioner and with GlusterFS using OKD Containerized GlusterFS. Many cloud providers have an NFS offering, which integrates easily using the nfs-client-provisioner.

Scalability / Resource Requirements

Calrissian is designed to issue tasks in parallel if they are independent, and thanks to Kubernetes, should be able to run very large parallel workloads.

When running Calrissian, you must provide a limit on the number of CPU cores (--max-cores) and megabytes of RAM (--max-ram) to use concurrently. Calrissian will use CWL ResourceRequirements to track usage and stay within the limits provided. We highly recommend using accurate ResourceRequirements in your workloads, so that they can be scheduled efficiently and are less likely to be terminated or refused by the cluster.
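The idea of tracking usage against the limits can be illustrated with a small sketch. This is not Calrissian's actual code: the ResourceBudget class and its method names are hypothetical, and it only shows how a scheduler might hold back a step until enough of the --max-cores and --max-ram budget is free.

```python
from dataclasses import dataclass


@dataclass
class ResourceBudget:
    """Hypothetical tracker for a concurrent CPU/RAM budget (illustration only)."""

    max_cores: float
    max_ram_mb: float
    used_cores: float = 0.0
    used_ram_mb: float = 0.0

    def try_reserve(self, cores: float, ram_mb: float) -> bool:
        """Reserve resources for a step if they fit; return False otherwise."""
        fits_cores = self.used_cores + cores <= self.max_cores
        fits_ram = self.used_ram_mb + ram_mb <= self.max_ram_mb
        if not (fits_cores and fits_ram):
            return False
        self.used_cores += cores
        self.used_ram_mb += ram_mb
        return True

    def release(self, cores: float, ram_mb: float) -> None:
        """Return resources to the budget once a step has finished."""
        self.used_cores = max(0.0, self.used_cores - cores)
        self.used_ram_mb = max(0.0, self.used_ram_mb - ram_mb)


# Example: a budget of 4 cores and 8192 MB admits two 2-core/4096 MB steps.
budget = ResourceBudget(max_cores=4, max_ram_mb=8192)
first_started = budget.try_reserve(2, 4096)
second_started = budget.try_reserve(2, 4096)
third_started = budget.try_reserve(1, 1)  # must wait: the budget is exhausted
```

A step whose ResourceRequirement overstates its needs reserves budget it never uses, which is one reason accurate requirements help throughput.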

Calrissian parameters can be provided via a JSON configuration file, either stored at ~/.calrissian/default.json or passed with the --conf option.

Below is an example of such a file:

{
    "max_ram": "16G",
    "max_cores": "10",
    "outdir": "/calrissian",
    "tmpdir_prefix": "/calrissian/tmp"
}
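If the configuration is generated by a script rather than written by hand, a minimal sketch could look like the following. The render_config helper is hypothetical and not part of Calrissian; the keys and values are taken from the example file, and no other keys are assumed to exist.

```python
import json


def render_config(max_ram="16G", max_cores="10",
                  outdir="/calrissian", tmpdir_prefix="/calrissian/tmp"):
    """Return the JSON text for a configuration file (hypothetical helper)."""
    config = {
        "max_ram": max_ram,
        "max_cores": max_cores,
        "outdir": outdir,
        "tmpdir_prefix": tmpdir_prefix,
    }
    return json.dumps(config, indent=4) + "\n"


# The resulting text can be saved to ~/.calrissian/default.json
# or to any path that is then passed via --conf.
config_text = render_config(max_ram="8G", max_cores="4")
```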

CWL Conformance

Calrissian leverages cwltool heavily and passes most conformance tests for CWL v1.0. Please see conformance for further details and processes.

To view open issues related to conformance, see the conformance label on the issue tracker.

Setup

Please see examples for installation and setup instructions.

Environment Variables

Calrissian's behaviors can be customized by setting the following environment variables in the container specification.

Pod lifecycle

By default, pods for a job step are deleted after termination.

  • CALRISSIAN_DELETE_PODS: Default true. If false, job step pods will not be deleted.

Kubernetes API retries

When encountering a Kubernetes API exception, Calrissian uses the tenacity library to retry API calls with an exponential backoff. See the tenacity documentation for details.

  • RETRY_MULTIPLIER: Default 5. Multiplier applied to the exponential interval.
  • RETRY_MIN: Default 5. Minimum interval between retries.
  • RETRY_MAX: Default 1200. Maximum interval between retries.
  • RETRY_ATTEMPTS: Default 10. Max number of retries before giving up.
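To get a feel for how these four settings interact, the sketch below computes the sequence of waits they would produce. It rests on assumptions: that the wait before retry n is RETRY_MULTIPLIER * 2**(n - 1) clamped between RETRY_MIN and RETRY_MAX (the formula used by recent tenacity releases for exponential waits; older releases may differ), and that the intervals are in seconds. The helper names are hypothetical and this is not Calrissian's own code.

```python
import os

# Defaults as documented above.
DEFAULTS = {
    "RETRY_MULTIPLIER": 5,
    "RETRY_MIN": 5,
    "RETRY_MAX": 1200,
    "RETRY_ATTEMPTS": 10,
}


def load_retry_settings(env=None):
    """Read the retry settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return {name: int(env.get(name, default)) for name, default in DEFAULTS.items()}


def backoff_schedule(settings):
    """Waits between attempts under the assumed clamped-exponential formula."""
    waits = []
    # N attempts leave N - 1 gaps between them.
    for attempt in range(1, settings["RETRY_ATTEMPTS"]):
        raw = settings["RETRY_MULTIPLIER"] * 2 ** (attempt - 1)
        waits.append(max(settings["RETRY_MIN"], min(raw, settings["RETRY_MAX"])))
    return waits


# With the documented defaults (passing an empty mapping ignores the real environment):
default_waits = backoff_schedule(load_retry_settings({}))
# default_waits == [5, 10, 20, 40, 80, 160, 320, 640, 1200]
```

Under these assumptions the default settings tolerate roughly 41 minutes of API unavailability in total before giving up; lowering RETRY_MAX or RETRY_ATTEMPTS shortens that window.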

For developers

Installing for Development

Note that for development you can just use Hatch directly, as described below.

Installing Hatch

The main tool used for development is Hatch. It manages dependencies (in a virtualenv that is created on the fly) and is also the command runner.

So first, install Hatch. Ideally, do this in an isolated way with pipx install hatch (after installing pipx), or just use pip install hatch as a more well-known way.

Running tests

hatch run test:test

Verbose:

hatch run test:testv

Running test coverage

hatch run test:cov

Running calrissian

hatch run calrissian

Serve the documentation

hatch run docs:serve
