sapporo-service
The sapporo-service is a standard implementation conforming to the Global Alliance for Genomics and Health (GA4GH) Workflow Execution Service (WES) API specification.
We have also extended the API specification. For more details, please refer to sapporo-wes-spec-2.0.0.yml or SwaggerUI - sapporo-wes-2.0.0.
Installation and Startup
The sapporo-service requires Python 3.8 or later.
Install using pip:
python3 -m pip install sapporo
Start the service:
sapporo
After startup, access localhost:1122/docs to view the API documentation, or query GET /service-info to verify the service is running.
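The health check above can also be scripted. The following is a minimal sketch using only the Python standard library. It assumes the default host and port and that GET /service-info returns a JSON object; the helper names `service_info_url` and `is_service_running` are hypothetical and not part of sapporo.

```python
import json
import urllib.request


def service_info_url(host: str = "127.0.0.1", port: int = 1122, url_prefix: str = "") -> str:
    """Build the GET /service-info URL from the same pieces as --host, --port, and --url-prefix."""
    return f"http://{host}:{port}{url_prefix.rstrip('/')}/service-info"


def is_service_running(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers 200 with a JSON object (assumed response shape)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and isinstance(json.load(resp), dict)
    except (OSError, ValueError):
        # Connection errors and HTTP errors are OSError subclasses; bad JSON is a ValueError.
        return False
```

Calling `is_service_running(service_info_url())` after startup should return True if the service is reachable on the default address.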
Using Docker
You can also run the sapporo-service using Docker. For Docker-in-Docker (DinD) setups, mount docker.sock, /tmp, and other necessary directories.
docker compose up -d
Supported Workflow Engines
The sapporo-service supports the following workflow engines:
- cwltool
- nextflow
- Toil (experimental)
- cromwell
- snakemake
- ep3 (experimental)
- StreamFlow (experimental)
Usage
View available options:
sapporo --help
usage: sapporo [-h] [--host] [--port] [--debug] [--run-dir] [--service-info]
               [--executable-workflows] [--run-sh] [--url-prefix] [--base-url]
               [--allow-origin] [--auth-config] [--run-remove-older-than-days]

The sapporo-service is a standard implementation conforming to the Global
Alliance for Genomics and Health (GA4GH) Workflow Execution Service (WES) API
specification.

options:
  -h, --help            show this help message and exit
  --host                Host address for the service. (default: 127.0.0.1)
  --port                Port number for the service. (default: 1122)
  --debug               Enable debug mode.
  --run-dir             Directory where the runs are stored. (default: ./runs)
  --service-info        Path to the service_info.json file.
  --executable-workflows
                        Path to the executable_workflows.json file.
  --run-sh              Path to the run.sh script.
  --url-prefix          URL prefix for the service endpoints. (default: '',
                        e.g., /sapporo/api)
  --base-url            Base URL for downloading the output files of the
                        executed runs. The files can be downloaded using the
                        format: {base_url}/runs/{run_id}/outputs/{path}.
                        (default: http://{host}:{port}{url_prefix})
  --allow-origin        Access-Control-Allow-Origin header value. (default: *)
  --auth-config         Path to the auth_config.json file.
  --run-remove-older-than-days
                        Clean up run directories with a start time older than
                        the specified number of days.
Run a Workflow
Execute workflows by calling POST /runs with the workflow document and parameters. See the ./tests/curl_example directory for example requests using curl.
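As a rough illustration, the form fields for POST /runs can be assembled as shown below. The field names follow the GA4GH WES run request (workflow_url, workflow_type, workflow_type_version, workflow_engine, workflow_params); which of them a given deployment requires is an assumption here, so check the API documentation at /docs. The helper name `build_run_request`, the workflow URL, and the parameter values are placeholders.

```python
import json


def build_run_request(workflow_url, workflow_type, workflow_type_version,
                      workflow_engine, workflow_params):
    """Return form fields for POST /runs; structured values are sent as JSON strings."""
    return {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        "workflow_engine": workflow_engine,
        "workflow_params": json.dumps(workflow_params),
    }


fields = build_run_request(
    workflow_url="https://example.com/workflow.cwl",  # placeholder URL
    workflow_type="CWL",
    workflow_type_version="v1.0",
    workflow_engine="cwltool",
    workflow_params={"input_file": {"class": "File", "path": "input.txt"}},  # placeholder inputs
)
print(fields["workflow_params"])
```

Each entry in `fields` could then be sent as one form field, for example as one `-F` flag per field with curl.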
After starting the service, access http://localhost:1122/docs to view API specifications through Swagger UI and execute requests directly from the browser.
Run Directory
The sapporo-service stores all workflow data in a "run directory" on the filesystem. Configure the location using --run-dir or the SAPPORO_RUN_DIR environment variable.
Structure:
$ tree run
.
├── 29
│   └── 29109b85-7935-4e13-8773-9def402c7775
│       ├── cmd.txt
│       ├── end_time.txt
│       ├── exe
│       │   └── workflow_params.json
│       ├── exit_code.txt
│       ├── outputs
│       │   └── <output_file>
│       ├── outputs.json
│       ├── run.pid
│       ├── run_request.json
│       ├── runtime_info.json
│       ├── start_time.txt
│       ├── state.txt
│       ├── stderr.log
│       ├── stdout.log
│       ├── system_logs.json
│       └── workflow_engine_params.txt
├── 2d
│   └── ...
└── sapporo.db
Runs can be deleted by removing their directories with rm.
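The tree above suggests that each run lives under a subdirectory named after the first two characters of its run ID. Below is a minimal sketch of locating a run and reading its state under that assumption. It also assumes state.txt holds a single state string; the value written in the demonstration is a placeholder, and the helper names are hypothetical.

```python
from pathlib import Path
import tempfile


def run_path(run_dir: Path, run_id: str) -> Path:
    """Map a run ID to <run_dir>/<first two characters>/<run_id> (assumed layout)."""
    return run_dir / run_id[:2] / run_id


def read_state(run_dir: Path, run_id: str) -> str:
    """Read state.txt for a run, stripped of surrounding whitespace."""
    return (run_path(run_dir, run_id) / "state.txt").read_text(encoding="utf-8").strip()


# Demonstration against a throwaway directory rather than a real run directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    run_id = "29109b85-7935-4e13-8773-9def402c7775"
    run_path(base, run_id).mkdir(parents=True)
    (run_path(base, run_id) / "state.txt").write_text("COMPLETE\n", encoding="utf-8")
    state = read_state(base, run_id)
    print(state)
```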
As of version 2.0.0, an SQLite database (sapporo.db) indexes runs for faster GET /runs queries. The database is updated every 30 minutes, so results from GET /runs may lag behind the run directory by up to that long. For real-time run status, use GET /runs/{run_id} or add latest=true to GET /runs.
run.sh
The run.sh script abstracts workflow engine execution. When POST /runs is called, the service forks run.sh after preparing run directory files. Modify this script to add or customize workflow engine support.
Override the default location using --run-sh or SAPPORO_RUN_SH.
Executable Workflows
Restrict which workflows can be executed using --executable-workflows or SAPPORO_EXECUTABLE_WORKFLOWS.
Format:
{
  "workflows": [
    "https://example.com/workflow.cwl"
  ]
}
An empty array allows all workflows. Each URL must be a remote resource (http/https). When the list is not empty, the workflow_url in a POST /runs request must appear in it; otherwise the request returns 400 Bad Request.
Query available workflows via GET /executable_workflows.
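The allow-list rule described above can be sketched in a few lines. This is an illustration of the documented behavior, not the service's own code; the function name `is_workflow_allowed` is hypothetical, and exact string comparison of URLs is an assumption (the service may compare URLs differently).

```python
import json


def is_workflow_allowed(workflow_url: str, config: dict) -> bool:
    """An empty list allows everything; otherwise the URL must appear in the list."""
    allowed = config.get("workflows", [])
    return not allowed or workflow_url in allowed


config = json.loads('{"workflows": ["https://example.com/workflow.cwl"]}')
print(is_workflow_allowed("https://example.com/workflow.cwl", config))
print(is_workflow_allowed("https://example.com/other.cwl", config))
print(is_workflow_allowed("https://example.com/other.cwl", {"workflows": []}))
```

The first and third calls are allowed (listed URL, empty list); the second corresponds to the case that yields 400 Bad Request.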
Download Output Files
List output files using GET /runs/{run_id}/outputs. Download all outputs as a zip file with ?download=true.
Download specific files using GET /runs/{run_id}/outputs/{path}.
Configure the base URL for downloads using --base-url or SAPPORO_BASE_URL. Files are accessible at {base_url}/runs/{run_id}/outputs/{path}.
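A small sketch of filling in the download URL template above. The helper name `output_url` is hypothetical, and percent-encoding the path is an assumption about what clients need when file names contain spaces or other special characters.

```python
from urllib.parse import quote


def output_url(base_url: str, run_id: str, path: str) -> str:
    """Fill in {base_url}/runs/{run_id}/outputs/{path}, percent-encoding the path but keeping '/'."""
    return f"{base_url.rstrip('/')}/runs/{quote(run_id)}/outputs/{quote(path.lstrip('/'))}"


url = output_url(
    "http://127.0.0.1:1122",
    "29109b85-7935-4e13-8773-9def402c7775",
    "results/summary report.txt",  # placeholder output path
)
print(url)
```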
Clean Up Run Directories
Automatically remove old run directories using --run-remove-older-than-days or SAPPORO_RUN_REMOVE_OLDER_THAN_DAYS.
Swagger UI
Access Swagger UI at http://localhost:1122/docs to explore API specifications and execute requests interactively.
Generate RO-Crate
The sapporo-service generates RO-Crate metadata (ro-crate-metadata.json) after each run. Retrieve it using GET /runs/{run_id}/ro-crate. Download the complete RO-Crate package as a zip file with ?download=true. See ./tests/ro-crate for details.
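Once retrieved, the metadata is ordinary JSON-LD and can be inspected with standard tools. The sketch below counts entities by type, relying on the general RO-Crate convention that entities are listed under "@graph" with an "@type" that is either a string or a list. The sample crate is hand-written for illustration and does not reflect the exact entities sapporo emits.

```python
from collections import Counter


def count_entity_types(crate: dict) -> Counter:
    """Count entities in the crate's @graph by @type (which may be a string or a list)."""
    counts = Counter()
    for entity in crate.get("@graph", []):
        types = entity.get("@type", [])
        if isinstance(types, str):
            types = [types]
        counts.update(types)
    return counts


# Tiny hand-written crate; a real one from GET /runs/{run_id}/ro-crate is larger.
sample = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork"},
        {"@id": "./", "@type": "Dataset"},
        {"@id": "outputs/result.txt", "@type": "File"},
        {"@id": "workflow.cwl", "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"]},
    ],
}
print(count_entity_types(sample))
```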
Authentication
Configure authentication via ./sapporo/auth_config.json:
{
  "auth_enabled": true,
  "idp_provider": "sapporo",
  "sapporo_auth_config": {
    "secret_key": "your_secure_secret_key_here",
    "expires_delta_hours": 24,
    "users": [
      {
        "username": "user1",
        "password_hash": "$argon2id$v=19$m=65536,t=3,p=4$..."
      }
    ]
  },
  "external_config": {
    "idp_url": "https://keycloak.example.com/realms/your-realm",
    "jwt_audience": "account",
    "client_mode": "public",
    "client_id": "sapporo-client",
    "client_secret": "client-secret-here"
  }
}
Override the location using --auth-config or SAPPORO_AUTH_CONFIG.
Configuration Fields
- auth_enabled: Enable/disable authentication
- idp_provider: sapporo (local) or external (IdP like Keycloak)
- sapporo_auth_config:
  - secret_key: JWT signing key (must be strong, see Security)
  - expires_delta_hours: JWT expiration time in hours (default: 24, max: 168)
  - users: List of users with username and password_hash
- external_config:
  - idp_url: External IdP URL (must use HTTPS in production)
  - jwt_audience: Expected JWT audience claim
  - client_mode: confidential or public
  - client_id / client_secret: OAuth2 credentials for confidential mode
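The constraints listed above can be checked before starting the service. The following is a minimal sketch of such a pre-flight check, not the validation sapporo itself performs. The function name `validate_auth_config` is hypothetical, and the lower bound on expires_delta_hours (greater than 0) is an assumption; only the maximum of 168 comes from the field description.

```python
def validate_auth_config(config: dict) -> list:
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    provider = config.get("idp_provider")
    if provider not in ("sapporo", "external"):
        problems.append(f"idp_provider must be 'sapporo' or 'external', got {provider!r}")
    if provider == "sapporo":
        local = config.get("sapporo_auth_config", {})
        hours = local.get("expires_delta_hours", 24)
        if not 0 < hours <= 168:  # lower bound is an assumption
            problems.append("expires_delta_hours must be greater than 0 and at most 168")
        for user in local.get("users", []):
            if not str(user.get("password_hash", "")).startswith("$argon2"):
                problems.append(f"user {user.get('username')!r} has no Argon2 password_hash")
    if provider == "external":
        mode = config.get("external_config", {}).get("client_mode")
        if mode not in ("confidential", "public"):
            problems.append("client_mode must be 'confidential' or 'public'")
    return problems


example = {
    "auth_enabled": True,
    "idp_provider": "sapporo",
    "sapporo_auth_config": {
        "secret_key": "your_secure_secret_key_here",
        "expires_delta_hours": 24,
        "users": [{"username": "user1", "password_hash": "$argon2id$v=19$m=65536,t=3,p=4$..."}],
    },
}
print(validate_auth_config(example))
```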
Authentication Endpoints
When authentication is enabled, the following endpoints require a valid JWT token:
- GET /service-info (optional: provides user-specific counts when authenticated)
- GET /runs
- POST /runs
- GET /runs/{run_id}
- POST /runs/{run_id}/cancel
- GET /runs/{run_id}/status
- GET /runs/{run_id}/outputs
- GET /runs/{run_id}/outputs/{path:path}
- GET /runs/{run_id}/ro-crate
- DELETE /runs/{run_id}
Each run is associated with a username, ensuring users can only access their own runs.
Authentication: sapporo mode
For local authentication:
# Start sapporo-service
sapporo
# Get JWT token
TOKEN=$(curl -s -X POST \
-H "Content-Type: multipart/form-data" \
-F "username=user1" \
-F "password=yourpassword" \
localhost:1122/token | jq -r '.access_token')
# Verify token
curl -X GET -H "Authorization: Bearer $TOKEN" localhost:1122/me
# Access protected endpoints
curl -X GET -H "Authorization: Bearer $TOKEN" localhost:1122/runs
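When debugging token problems, it can help to look inside the token. A JWT is three base64url segments separated by dots, and the middle segment is a JSON payload. The sketch below decodes that payload without verifying the signature, so it is suitable for inspection only, never for authorization. It assumes the token carries the standard `exp` claim; the `sub` claim and the demo token are made up for illustration.

```python
import base64
import json
import time


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (middle) segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(token: str, now=None) -> float:
    """Seconds left before the 'exp' claim; negative if already expired."""
    now = time.time() if now is None else now
    return decode_jwt_payload(token)["exp"] - now


def _b64(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Unsigned demo token so the example runs offline; a real token comes from /token.
demo = ".".join([_b64({"alg": "none"}), _b64({"sub": "user1", "exp": 1700086400}), ""])
print(decode_jwt_payload(demo)["sub"], seconds_until_expiry(demo, now=1700000000))
```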
Authentication: external mode
In external mode, integrate with an IdP like Keycloak. Users authenticate with the IdP, which issues JWTs that the sapporo-service verifies.
See ./compose.keycloak.dev.yml for a Keycloak development setup example.
Security
Password Hashing
All passwords are stored as Argon2 hashes. Generate password hashes using the CLI:
python -m sapporo.cli hash-password
# Follow the prompts to enter and confirm your password
# Output: Password hash: $argon2id$v=19$m=65536,t=3,p=4$...
Or non-interactively, passing the password as an argument (not recommended for interactive use, since the password may be recorded in shell history):
python -m sapporo.cli hash-password --password "your_password"
Secret Key Generation
Generate a cryptographically secure secret key:
python -m sapporo.cli generate-secret
# Output: Secret key: <44-character secure random string>
Important: In production mode (non-debug), weak secret keys are rejected. Always use a generated secret key in production deployments.
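For environments where the CLI is not available, a key of comparable strength can be produced with the Python standard library. This sketch encodes 32 random bytes as URL-safe base64, which yields a 44-character string. Whether the `generate-secret` command uses the same method is not stated here, so treat this as one possible approach; the function name is hypothetical.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return num_bytes of OS-level randomness, URL-safe base64 encoded (44 chars for 32 bytes)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


key = generate_secret_key()
print(len(key))
```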
HTTPS for External IdP
When using external identity providers, HTTPS is required by default. This prevents token interception during authentication flows.
To allow HTTP connections during development (not recommended for production):
export SAPPORO_ALLOW_INSECURE_IDP=true
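The rule above amounts to a scheme check on idp_url with an explicit opt-out. The following sketch illustrates that logic. It is not the service's own implementation: the function name `check_idp_url` is hypothetical, and treating only the literal value "true" (case-insensitive) as enabling the opt-out is an assumption.

```python
import os
from urllib.parse import urlparse


def check_idp_url(idp_url: str, env=os.environ) -> None:
    """Raise ValueError for a non-HTTPS idp_url unless SAPPORO_ALLOW_INSECURE_IDP is 'true'."""
    scheme = urlparse(idp_url).scheme.lower()
    allow_insecure = env.get("SAPPORO_ALLOW_INSECURE_IDP", "").lower() == "true"
    if scheme == "https" or (scheme == "http" and allow_insecure):
        return
    raise ValueError(f"idp_url must use HTTPS: {idp_url}")


# HTTPS passes without any opt-out; plain HTTP passes only with the variable set.
check_idp_url("https://keycloak.example.com/realms/your-realm", env={})
check_idp_url("http://localhost:8080/realms/dev", env={"SAPPORO_ALLOW_INSECURE_IDP": "true"})
```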
Development
Start the development environment:
docker compose -f compose.dev.yml up -d --build
docker compose -f compose.dev.yml exec app bash
# Inside the container
sapporo --debug
Run lint and tests:
# Lint and format
pylint ./sapporo
mypy ./sapporo
isort ./sapporo
# Test
pytest
Differences Between Sapporo Service 2.x and 1.x
- Changed from Flask to FastAPI
- Updated base GA4GH WES from 1.0.0 to 1.1.0
- Reorganized authentication with switchable methods
- Added SQLite database for faster GET /runs queries
- Organized Python and Docker toolchain
- Simplified executable_workflows.json to a list of workflow_urls
- Full support for automatic run directory cleanup
- See sapporo-wes-spec-2.0.0.yml for detailed API specifications
Adding New Workflow Engines to the Sapporo-service
The sapporo-service invokes workflow engines through run.sh. Edit this script to add or customize workflow engines. For an example, see the StreamFlow addition PR.
License
This project is licensed under the Apache-2.0 license. See the LICENSE file for details.