Infrastructure-as-code for ephemeral AWS ParallelCluster environments for bioinformatics

Daylily Ephemeral Cluster

Daylily stands up a short-lived AWS ParallelCluster, finishes the headnode configuration after pcluster itself reports success, and gives the operator a validated Session Manager login shell as ubuntu. It then stages laptop-side inputs into the FSx-backed data plane, launches the workflow repo in tmux, exports results back to the backing S3 repository, and tears the cluster down when the run is complete.

The bucket is durable. The cluster is ephemeral. Export before delete.

Supported Operator Contract

The supported path is:

  1. source ./activate
  2. daylily-ec preflight
  3. daylily-ec create
  4. daylily-ec headnode connect
  5. daylily-ec samples stage
  6. daylily-ec workflow launch
  7. daylily-ec export --target-uri analysis_results/ubuntu
  8. daylily-ec delete --dry-run
  9. daylily-ec delete

Supported remote access is AWS Systems Manager Session Manager landing directly in the ubuntu login shell. The repo hard-checks the Session Manager document and the effective remote user before supported command payloads run.
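
The effective-user half of that check can be pictured with a small guard. This is a minimal sketch, not the repo's actual validation: require_user is a hypothetical helper, and it only compares the output of id -un with the expected name.

```shell
# Hypothetical guard, not part of daylily-ec: succeed only when the
# current effective user matches the expected name.
require_user() {
  expected=$1
  actual=$(id -un)
  if [ "$actual" != "$expected" ]; then
    echo "refusing to continue: running as $actual, expected $expected" >&2
    return 1
  fi
}

# Example use at the top of a payload script on the headnode:
# require_user ubuntu || exit 1
```

Failing closed here matches the stated contract: landing as any other user is treated as a defect rather than something to work around.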

A cluster is not "ready" when CloudFormation or ParallelCluster first says the infrastructure exists. The supported readiness point is when daylily-ec create returns successfully after the post-create headnode configuration and bootstrap validation steps complete.
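
In a script, that readiness rule amounts to gating every later step on the exit status of the create command. The sketch below assumes that "returns successfully" means exit status 0; wait_until_ready is a hypothetical wrapper name and is not part of daylily-ec.

```shell
# Hypothetical wrapper: run the given command and report readiness only
# when it exits 0. Any non-zero status is passed back to the caller.
wait_until_ready() {
  if "$@"; then
    echo "ready"
  else
    rc=$?
    echo "not ready (exit $rc)" >&2
    return "$rc"
  fi
}

# Example use (assumes the variables from the lifecycle section are set):
# wait_until_ready daylily-ec create \
#   --profile "$AWS_PROFILE" --region-az "$REGION_AZ" --config "$DAY_EX_CFG" \
#   || exit 1
```

The point of the wrapper is that nothing polls CloudFormation or pcluster directly; the only signal consulted is whether the create command itself finished cleanly.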

One Copy-Pasteable Lifecycle

source ./activate

export AWS_PROFILE=daylily-service-lsmc
export REGION=us-west-2
export REGION_AZ=us-west-2d
export CLUSTER_NAME=day-demo-$(date +%Y%m%d%H%M%S)
export DAY_EX_CFG="$HOME/.config/daylily/daylily_ephemeral_cluster.yaml"
export REF_BUCKET=s3://lsmc-dayoa-omics-analysis-us-west-2
export ANALYSIS_SAMPLES=etc/analysis_samples_template.tsv
export STAGE_CFG_DIR="$PWD/tmp-stage-config/$CLUSTER_NAME"
export EXPORT_DIR="$PWD/tmp-export/$CLUSTER_NAME"

daylily-ec preflight \
  --profile "$AWS_PROFILE" \
  --region-az "$REGION_AZ" \
  --config "$DAY_EX_CFG"

daylily-ec create \
  --profile "$AWS_PROFILE" \
  --region-az "$REGION_AZ" \
  --config "$DAY_EX_CFG"

daylily-ec headnode connect \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME"

daylily-ec samples stage \
  "$ANALYSIS_SAMPLES" \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --reference-bucket "$REF_BUCKET" \
  --config-dir "$STAGE_CFG_DIR"

# The manifest is row-oriented and multi-modality:
# - legacy Illumina rows can still use R1_FQ/R2_FQ
# - aligned inputs can be supplied directly through ULTIMA_CRAM, ONT_CRAM,
#   PB_BAM, ONT_BAM, or ROCHE_BAM columns
# - hybrid units populate multiple source groups on one row

# Use the "Remote FSx stage directory" printed by the staging helper.
daylily-ec workflow launch \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME" \
  --stage-dir "/fsx/data/staged_sample_data/remote_stage_<timestamp>" \
  --destination dayoa \
  --aligners sent \
  --dedupers dppl \
  --snv-callers sentd

daylily-ec export \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster-name "$CLUSTER_NAME" \
  --target-uri analysis_results/ubuntu \
  --output-dir "$EXPORT_DIR"

cat "$EXPORT_DIR/fsx_export.yaml"

daylily-ec delete --dry-run \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster-name "$CLUSTER_NAME"

daylily-ec delete \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster-name "$CLUSTER_NAME"
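
The --stage-dir value above is a placeholder that has to be replaced with the "Remote FSx stage directory" printed by the staging helper. If the staging output was saved to a file, something like the following sketch could pull the path out. It assumes the label and the /fsx/... path appear on the same line, which the source does not confirm; extract_stage_dir is a hypothetical helper.

```shell
# Hypothetical helper: read staging output on stdin and print the first
# /fsx/... path found on a line that carries the stage-directory label.
# Assumption: label and path are printed on one line.
extract_stage_dir() {
  grep 'Remote FSx stage directory' | grep -o '/fsx/[^[:space:]"]*' | head -n 1
}

# Example use, with stage.log holding the saved output of "samples stage":
# STAGE_DIR=$(extract_stage_dir < stage.log)
# [ -n "$STAGE_DIR" ] || { echo "stage directory not found" >&2; exit 1; }
```

If the helper prints nothing, copy the path by hand rather than guessing at it.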

fsx_export.yaml is the machine-readable export receipt. A successful run writes status: success and the resolved S3 destination.
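
Because the receipt is machine-readable, the "export before delete" rule can be enforced rather than eyeballed. The sketch below assumes only what is stated above, that a successful run writes a status: success line; the rest of the file layout is unknown here, and export_succeeded is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the receipt file exists and
# contains a "status: success" line (optionally quoted).
export_succeeded() {
  receipt=$1
  [ -f "$receipt" ] &&
    grep -Eq '^[[:space:]]*status:[[:space:]]*"?success"?[[:space:]]*$' "$receipt"
}

# Example use before teardown:
# export_succeeded "$EXPORT_DIR/fsx_export.yaml" || {
#   echo "no successful export receipt; not deleting" >&2
#   exit 1
# }
```

A missing receipt is treated the same as a failed export, so a mistyped path cannot lead to an accidental delete.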

Architecture At A Glance

  1. daylily-ec is the control-plane CLI. It handles preflight, create, cluster inspection, export, delete, environment introspection, runtime checks, and pricing snapshots.
  2. The create flow renders the cluster configuration, calls ParallelCluster, then runs Daylily headnode configuration over Session Manager.
  3. The data plane is the durable S3 bucket plus the FSx for Lustre filesystem attached to the cluster. Laptop-side staging writes into the bucket-backed FSx namespace.
  4. The supported connect path is daylily-ec headnode connect, which opens Session Manager into the ubuntu login shell.
  5. Workflow launch happens from the operator machine through daylily-ec workflow launch, which creates a run directory at /home/ubuntu/daylily-runs/<session>/, writes launch.sh, tmux.log, and status.json, and starts the run inside tmux.
  6. Export uses the FSx data repository task API and writes fsx_export.yaml locally so the operator has a concrete export receipt before teardown.

What This Repo Ships

  • environment.yaml plus pyproject.toml: the DAY-EC environment contract
  • activate: checkout bootstrap that creates or repairs DAY-EC, installs the repo editable, and validates the local toolchain
  • daylily-ec headnode connect: interactive Session Manager shell launcher with ubuntu-only validation
  • daylily-ec headnode configure: explicit headnode configuration helper for repair or manual reruns
  • daylily-ec headnode info: full pcluster describe-cluster output for one cluster
  • daylily-ec headnode jobs: Slurm queue output using the same format as the headnode sq alias
  • daylily-ec cluster list/describe/wait: ParallelCluster inspection helpers
  • daylily-ec samples stage: translator and staging helper that turns a multi-modality analysis_samples.tsv into workflow-ready samples.tsv and units.tsv
  • daylily-ec workflow launch/status/logs: remote launcher and run-state inspection helpers
  • daylily-ec state list/show: local state-file inspection helpers
  • daylily_ec/ssh_to_ssm_e2e_runner.py: AWS-backed end-to-end runner that exercises the supported lifecycle through the repo CLI/helpers

AWS And Local Prerequisites

At minimum, the operator account needs:

  • a working named AWS profile
  • permission for STS identity lookup, IAM inspection/bootstrap, Service Quotas reads, S3 bucket discovery/access, EC2/VPC inspection, FSx, SSM, and ParallelCluster operations
  • a reference bucket in the target region that will back the cluster FSx filesystem
  • Session Manager document SSM-SessionManagerRunShell configured to run shell sessions as ubuntu in /home/ubuntu and source a login shell
  • enough regional quota for the requested cluster shape

Local toolchain for the supported path:

  • Conda
  • daylily-ec
  • aws
  • pcluster
  • session-manager-plugin
  • jq, yq, rclone, node, and the rest of the DAY-EC Conda layer

If any of this is missing, cluster creation will fail in annoying ways. Run daylily-ec preflight first and read the failures instead of guessing.
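
For a quick local look before running preflight, the commands in the toolchain list can be checked for presence on PATH. This is a minimal sketch and not a substitute for daylily-ec preflight: missing_tools is a hypothetical helper, and it checks only that each command resolves, not its version or configuration.

```shell
# Hypothetical helper: print one line per command that is not on PATH
# and return non-zero if any command is missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example use with the commands named in the list above:
# missing_tools conda daylily-ec aws pcluster session-manager-plugin \
#   jq yq rclone node
```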

Cost, Time, And Failure Notes

  • daylily-ec create can take a long time. The ParallelCluster build alone can take tens of minutes, and Daylily still has headnode bootstrap work to finish after that.
  • The cluster is disposable; the export target is not. Do not delete until you have checked fsx_export.yaml.
  • The supported remote user is ubuntu. Any path that would land you as another user is a defect, not a supported fallback.
  • Session Manager misconfiguration is a hard stop. The repo does not tell operators to connect first and then switch users manually.
