PlantIT CLI

Deploy PlantIT workflows on laptops, servers, or clusters.
Container orchestration for reproducible phenotyping workflows on laptops, servers, or HPC
- Parallel transfers to/from the CyVerse Data Store with Terrain API
- Deploy Docker images as Singularity containers on clusters/servers
- Compatible with any cluster scheduler supported by Dask-Jobqueue
This package must be installed and available in the $PATH on agents bound to PlantIT.
- Python 3.6.9+
To install the PlantIT CLI, use pip:
```shell
pip3 install plantit-cli
```
Once the CLI is installed, it can be invoked with plantit <command>.
The CLI supports the following commands:
pull: Download files from the CyVerse Data Store.
run: Run a workflow.
clean: Remove patterns from result files.
zip: Zip files produced by a workflow.
push: Upload files to the CyVerse Data Store.
To pull files from the
/iplant/home/shared/iplantcollaborative/testing_tools/cowsay/ directory in the CyVerse Data Store to the current working directory, use:
```shell
plantit terrain pull /iplant/home/shared/iplantcollaborative/testing_tools/cowsay/ --terrain_token <token>
```
Optional arguments are:
--local_path (-p): Local path to download files to.
--pattern: File patterns to include (one or more).
--overwrite: Whether to overwrite already-existing files.
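For example, the following invocation (paths, local directory, and token are placeholders) downloads only JPG and PNG files into a local data directory, overwriting any existing copies:

```shell
plantit terrain pull /iplant/home/shared/iplantcollaborative/testing_tools/cowsay/ \
    -p ./data \
    --pattern jpg \
    --pattern png \
    --overwrite \
    --terrain_token <token>
```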
PlantIT workflows are defined in YAML files. To run a workflow defined in hello_world.yaml, use plantit run hello_world.yaml. At minimum, the schema should include the following attributes:
```yaml
image: docker://alpine               # Docker image
workdir: /your/working/directory     # working directory
command: echo "Hello, world!"        # entrypoint
```
Note that your
command may fail on some images if it contains
&&. If you must run multiple consecutive commands, it's probably best to package them into a script.
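For instance, a minimal sketch of this approach (the script name and contents here are hypothetical, and the script is assumed to live in the working directory):

```yaml
# run.sh might contain, for example:
#   #!/bin/bash
#   echo "step 1" && echo "step 2"
image: docker://alpine
workdir: /your/working/directory
command: sh run.sh
```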
Runs involving inputs fall into 3 categories:
- spawn a single container to process a single file
- spawn a single container to process a single directory
- spawn a container per file to process files in a directory
To pull a file or directory from the Data Store, add an input section to your configuration.
To pull a file from the Data Store and spawn a single container to process it, use kind: file and set path to the file's location:

```yaml
input:
  kind: file
  path: /iplant/home/username/directory/file
```
To pull a directory from the Data Store and spawn a container for each file, use kind: files and set path to the directory's location (optionally filtering files with patterns):

```yaml
input:
  kind: files
  path: /iplant/home/username/directory
  patterns: # optional
    - jpg
    - png
```
To pull the contents of a directory from the Data Store and spawn a single container to process it, use kind: directory and set path to the directory's location:

```yaml
input:
  kind: directory
  path: /iplant/home/username/directory
```
If your code needs to write temporary files somewhere other than the (automatically mounted) host working directory, use the bind_mounts attribute:

```yaml
bind_mounts:
  - /path/in/your/container  # defaults to the host working directory
  - path/relative/to/host/working/directory:/another/path/in/your/container
```
CUDA GPU mode
To instruct Singularity to bind to NVIDIA GPU drivers on the host, add a
gpu: True attribute to your configuration.
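For example, extending the minimal configuration above (the image and command here are placeholders):

```yaml
image: docker://nvidia/cuda
workdir: /your/working/directory
command: nvidia-smi
gpu: True
```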
On high-performance or high-throughput computing systems with a scheduler like Torque or SLURM, you can parallelize multi-file runs by adding a
jobqueue section like the following:
```yaml
...
jobqueue:
  slurm:
    cores: 1
    processes: 10
    project: '<your allocation>'
    walltime: '01:00:00'
    queue: '<your queue>'
```
You can use slurm or any other cluster scheduler configuration section supported by Dask-Jobqueue (the CLI uses Dask internally and passes your configuration directly through).
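For instance, a sketch of an equivalent section for a PBS cluster, assuming the section name maps to Dask-Jobqueue's PBSCluster in the same way slurm maps to SLURMCluster:

```yaml
...
jobqueue:
  pbs:
    cores: 1
    processes: 10
    project: '<your allocation>'
    walltime: '01:00:00'
    queue: '<your queue>'
```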
For clusters with virtual memory, you may need to use
header_skip to alter Dask's resource request from the scheduler:
```yaml
...
jobqueue:
  slurm:
    ...
    header_skip:
      - '--mem'  # for clusters with virtual memory
```
Other resource requests
You can add other cluster-specific resource requests, like GPU-enabled nodes, with an extra section:

```yaml
...
jobqueue:
  slurm:
    ...
    extra:
      - '--gres=gpu:1'
```
Due to cluster scheduler configuration quirks, when invoking the CLI with a Docker username and password, these secrets may end up in job output files. The clean command can be used to remove arbitrary patterns from files; out of an abundance of caution, it's recommended to always run it before zipping or pushing results. For example:
```shell
plantit clean myjob.1234567.out myjob.1234567.err -p 'secret_password'
```
To zip all files in a directory, use plantit zip <input directory>.
To include file patterns or names, use one or more --include_pattern (-ip) or --include_name (-in) flags.
To exclude file patterns or names, use one or more --exclude_pattern (-ep) or --exclude_name (-en) flags.
Included files are gathered first, then excludes are filtered out of this collection.
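For example, a hypothetical invocation (assuming zip accepts the same include/exclude flags listed for push below) that zips JPGs and PNGs from a local output directory while skipping a file named excluded.png:

```shell
plantit zip ./output -ip jpg -ip png -en excluded.png
```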
To push files in the current working directory to the /iplant/home/<my>/<directory>/ collection in the CyVerse Data Store, use:

```shell
plantit terrain push /iplant/home/<my>/<directory>/ --terrain_token <token>
```

Optional arguments are:
--local_path (-p): Local path to push files from.
--include_pattern (-ip): File patterns to include (one or more).
--include_name (-in): File names to include (one or more).
--exclude_pattern (-ep): File patterns to exclude (one or more).
--exclude_name (-en): File names to exclude (one or more).
If only include_... options are provided, only the file patterns and names specified will be included. If only exclude_... options are provided, all files except the specified patterns and names will be included. If both include_... and exclude_... options are provided, the include_... rules are applied first to generate a subset of files, which is then filtered by the exclude_... rules. For example:
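The following invocation (with placeholder paths and token) pushes only JPGs and PNGs from a local output directory, excluding any file named excluded.png:

```shell
plantit terrain push /iplant/home/<my>/<directory>/ \
    -p ./output \
    -ip jpg \
    -ip png \
    -en excluded.png \
    --terrain_token <token>
```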
The plantit ping command is used internally by the PlantIT web application to test whether the CLI is properly installed on user-defined agents.
Authenticating with Docker
To authenticate with Docker and bypass Docker Hub rate limits, provide a --docker_username and --docker_password. For instance:
```shell
plantit run hello_world.yaml --docker_username <your username> --docker_password <your password>
```
This is only required for the
plantit run command.
Authenticating with Terrain
The plantit run command uses the Terrain API to access the CyVerse Data Store. Runs with inputs and outputs must provide a --cyverse_token argument. For instance, to run hello_world.yaml:
```shell
plantit run hello_world.yaml --cyverse_token 'eyJhbGciOiJSUzI1N...'
```
A CyVerse access token can be obtained from the Terrain API with a
GET request (providing username/password for basic auth):
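A sketch of such a request, assuming the public de.cyverse.org Terrain deployment (check the Terrain documentation for the exact token endpoint):

```shell
# basic auth with CyVerse username/password; the JSON response contains an access token
curl -u <username>:<password> https://de.cyverse.org/terrain/token
```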
Authenticating with PlantIT
When the plantit run command is invoked, a --plantit_token option may be provided to authenticate with PlantIT's RESTful API and push task status updates and logs back to the web application. This is only intended for internal use; requests with an invalid token or for a nonexistent task will be rejected.
By default, the CLI will print all output to stdout. If a --plantit_token is provided, output will be POSTed back to the PlantIT web application (only output generated by the CLI itself; container output will just be printed to stdout). This is suitable for most cluster deployment targets, whose schedulers should automatically capture job output. To configure the CLI itself to write container output to a file, add the following to your configuration file:
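The exact attribute is not shown in this document; as a hypothetical sketch, assuming a log_file attribute that takes a path relative to the working directory:

```yaml
...
log_file: plantit.log  # hypothetical attribute name; consult the CLI docs for the actual key
```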
To set up a development environment, clone the repo with
git clone https://github.com/Computational-Plant-Science/plantit-cli.git. Then run
scripts/bootstrap.sh (this will pull/build images for a small
docker-compose SLURM cluster test environment).
To run unit tests:
```shell
docker compose -f docker-compose.test.yml run -w /opt/plantit-cli/runs slurmctld python3 -m pytest /opt/plantit-cli/plantit_cli/tests/unit -s
```
Note that integration tests invoke the Terrain API and may take some time to complete; they're rigged with a delay to allow writes to propagate from Terrain to the CyVerse Data Store (some pass/fail non-determinism occurs otherwise). To run integration tests:
```shell
docker compose -f docker-compose.test.yml run -w /opt/plantit-cli/runs slurmctld python3 -m pytest /opt/plantit-cli/plantit_cli/tests/integration -s
```