Shell command library and Snakemake wrappers for Sequana pipelines
The Sequana Wrapper Repository
| Overview | Shell command library and Snakemake wrappers for Sequana pipelines |
| Status | Production (wrappers/ — maintenance only) / Active (shells/, snippets/) |
| Issues | Please file a report on github/sequana/sequana-wrappers |
| Python version | Python 3.8+ |
| Citation | Cokelaer et al, (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, doi:10.21105/joss.00352 |
Status and roadmap
This repository contains two independent mechanisms for providing bioinformatics tool commands to Sequana pipelines:
- wrappers/ is the original Snakemake wrapper system (Python scripts + conda environment.yaml). This tree is now in maintenance mode. No new wrappers will be added. Bug fixes will still be accepted, but all new development happens in shells/. See the rationale below for the full explanation.
- sequana_wrappers/shells/ is the new shell command library. It holds versioned shell strings that work with container: + shell: rules, with no Python inside the container. This is the active development track and the recommended approach for all new Sequana pipelines.
- sequana_wrappers/snippets/ holds versioned Python callables for pipeline steps that require Python logic but still benefit from shared, versioned definitions. They are used via run: blocks (not shell:). See the snippets section below for details.

All wrappers are available in shells except three, which require Python imports from the sequana library and cannot be expressed as pure bash:
- fastq_stats — uses sequana.FastQC + matplotlib
- freebayes_vcf_filter — uses sequana.VCF_freebayes Python class
- snpeff_add_locus_in_fasta — uses sequana.SnpEff.add_locus_in_fasta()
The rulegraph rule formerly in wrappers has been migrated to sequana_wrappers/snippets/rulegraph/ because it requires Python imports from sequana_pipetools and cannot run as a pure bash command inside a container.
Quick start — shells (recommended)
Install the package:
pip install sequana_wrappers
Use in a Snakemake pipeline via sequana_pipetools:
# In your pipeline’s .rules file — manager is a PipelineManager instance
rule minimap2:
    input: ...
    output: "{sample}/{sample}.sorted.bam"
    container: "https://zenodo.org/record/7987999/files/samtools_1.17_minimap2_2.24.0.img"
    shell: manager.get_shell("minimap2/align", "v1")
Quick start — snippets (Python run blocks)
When a pipeline step requires Python logic (host-side imports, file path
resolution, etc.) but you still want the code to be shared and versioned, use
get_run with a run: block:
rule rulegraph:
    input: "Snakefile"
    output: "rulegraph/rulegraph.svg"
    params: configname="config.yaml"
    run:
        manager.get_run("rulegraph/run", "v1")(snakemake)
The snippet's execute(input, output, params) function runs on the host
(where sequana_pipetools and other Python dependencies are available) — no
container is involved.
Quick start — wrappers (legacy)
snakemake --wrapper-prefix https://github.com/sequana/sequana-wrappers
or with a local copy:
git clone git@github.com:sequana/sequana-wrappers.git sequana_wrappers
snakemake --wrapper-prefix git+file:///home/user/sequana_wrappers
If the environment variable SEQUANA_WRAPPERS is set to
git+file:///home/user/sequana_wrappers, all pipelines will automatically use
it as the --wrapper-prefix.
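A minimal sketch of how such an environment-variable override could be resolved. The function name wrapper_prefix and the use of the GitHub URL as the fallback are assumptions made for illustration, not the actual Sequana implementation:

```python
import os

# Assumed fallback: the GitHub prefix shown in the legacy quick start.
DEFAULT_PREFIX = "https://github.com/sequana/sequana-wrappers"


def wrapper_prefix(environ=None):
    """Return the wrapper prefix, preferring SEQUANA_WRAPPERS when it is set."""
    environ = os.environ if environ is None else environ
    return environ.get("SEQUANA_WRAPPERS", DEFAULT_PREFIX)


# With a local clone exported in the environment, the local path is used.
local = wrapper_prefix({"SEQUANA_WRAPPERS": "git+file:///home/user/sequana_wrappers"})
print(local)
```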
The shells/ directory — rationale and design
Background
Sequana pipelines use two mechanisms to provide bioinformatics tools:
- Wrappers (wrappers/): Python scripts (wrapper.py) plus a conda environment.yaml. Snakemake fetches and executes them via the wrapper: rule directive.
- Containers: Apptainer/Singularity images (hosted on Zenodo/Damona) referenced by the container: rule directive.
The problem: wrapper: + container: are incompatible
When a Snakemake rule combines both wrapper: and container:, and the
pipeline is run with --use-singularity / --apptainer-prefix, Snakemake v7
executes the wrapper Python script inside the container using the
container's own Python binary.
Snakemake itself does not need to be installed inside the container: it
bind-mounts its own site-packages from the host at /mnt/snakemake.
However, the container's Python binary must be ABI-compatible with the
host Python. Old bioconda/Damona images (Python 3.8) fail when the host runs
Python 3.10 because C-extension .so files compiled for 3.10 cannot be loaded
by Python 3.8:
ImportError: /mnt/snakemake/...so: cannot open shared object
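The mismatch can be made concrete with a small check. This is only a sketch: same_minor is a hypothetical helper, and comparing major.minor versions is a simplification of the real ABI rules (extensions built against the stable ABI, for instance, behave differently).

```python
import sys
import sysconfig


def extension_suffix():
    """Return the file suffix this interpreter expects for compiled extensions."""
    return sysconfig.get_config_var("EXT_SUFFIX")


def same_minor(host, container):
    """Simplified compatibility test: the major.minor versions must be equal."""
    return tuple(host[:2]) == tuple(container[:2])


# The suffix usually embeds the interpreter version, which is why a .so file
# built for one minor version is not picked up by another.
print(sys.version_info[:2], extension_suffix())

# The situation described above: host on 3.10, container on 3.8.
print(same_minor((3, 10), (3, 8)))
```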
A further sign of inverted concerns: Damona had to ship Python inside
tool containers (bwa, samtools, …) specifically to satisfy the wrapper
mechanism. A container for bwa should contain bwa, not a Python runtime
serving the pipeline framework.
Options considered
Option 1 — Update container images to Python 3.10
Rebuild all Damona images with a Python version matching the host. Wrappers would then work with containers as originally intended.
Rejected as the primary fix: containers would still carry Python only to serve the framework; images must be rebuilt every time the host Python is upgraded; the inverted concern is not resolved.
Option 2 — Remove container: from wrapper rules (short-term workaround)
Wrapper rules run on the host (or via --use-conda); only pure shell: rules
keep their container: directive.
Used as a temporary workaround in sequana_mapper while the shell library was being designed. Downside: Apptainer only covers a subset of rules.
Option 3 — Drop wrappers, inline shell: in each pipeline (simplest)
Replace every wrapper with a hand-written shell: block inside the pipeline
rule. No shared library.
Rejected: duplicates logic across all pipelines; maintenance burden; loses the reusability benefit of this repository entirely.
Option 4 — Shell command library in shells/ ✓ (chosen)
Return to the spirit of the early sequana approach: define reusable, versioned
shell command strings here, alongside the existing wrappers/. Pipelines
import these strings and use them in shell: + container: rules.
Why this wins:
| Property | Wrappers | Shell library |
|---|---|---|
| Reusable logic | Yes (Python) | Yes (string) |
| Python in container | Required | Not needed |
| Git tag checkout at run time | Yes | No |
| Damona images lean | No | Yes |
| Works with --use-conda | Yes | No |
| Apptainer compatible | Only if Python ABI matches | Always |
| Backward compatible | — | Yes (wrappers/ kept) |
The wrappers/ tree is kept untouched for full backward compatibility.
Design
Repository layout
sequana-wrappers/
├── wrappers/                      # existing, kept for backward compat
│   ├── bwa/align/wrapper.py
│   └── ...
└── sequana_wrappers/
    ├── __init__.py                # get_shell() and get_run()
    ├── shells/                    # container-first shell strings
    │   ├── bwa/
    │   │   ├── align/
    │   │   │   └── v1/cmd.py      # frozen at release v1
    │   │   └── build/
    │   │       └── v1/cmd.py
    │   ├── bamtools/stats/v1/cmd.py
    │   └── ...
    └── snippets/                  # host-side Python callables
        ├── rulegraph/run/v1/code.py
        └── ...
Versioning convention
Every shell script is named cmd.py and lives inside a named version
subdirectory. The structure is:
sequana_wrappers/shells/<tool>/<command>/<version>/cmd.py
Valid version names are:
- vN (e.g. v1, v2): frozen, reproducible snapshots. Once created, these files are never edited.
- dev: work-in-progress version used during active development. A dev/ directory is created when new work begins on a command and removed (or renamed to vN) at release time. No dev/ directories exist in released versions of this package.
Because the file name is always cmd.py, the tool and command are encoded
entirely in the directory path. This makes deeper nesting in the future
(e.g. shells/bamtools/stats/paired/v1/cmd.py) natural without any
changes to the get_shell API.
There is no silent fallback between versions: requesting a version that does not exist raises an explicit error.
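The path convention and the no-fallback rule can be sketched as follows. The helper load_cmd and its root argument are hypothetical names used for illustration only; the packaged get_shell may be implemented differently.

```python
import importlib.util
from pathlib import Path


def load_cmd(root, name, version):
    """Return the CMD string stored at <root>/<name>/<version>/cmd.py.

    Hypothetical helper, not the actual get_shell implementation.
    """
    path = Path(root) / name / version / "cmd.py"
    if not path.is_file():
        # No fallback to a neighbouring version: fail loudly instead.
        raise FileNotFoundError(
            f"no shell command {name!r} at version {version!r} ({path})"
        )
    # Load the file as a throwaway module and read its CMD attribute.
    spec = importlib.util.spec_from_file_location("cmd", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.CMD
```

Under these assumptions, load_cmd(root, "bwa/align", "v1") would return the string defined in bwa/align/v1/cmd.py, while asking for a version directory that does not exist raises FileNotFoundError rather than returning another version.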
Version axes
Two version axes are completely independent:
| Axis | What it pins | Where it lives |
|---|---|---|
| Tool binary | e.g. bwa 0.7.17 | Container image (Damona / Zenodo) |
| Shell command | e.g. v1 | hardcoded per rule in the pipeline |
Each pipeline rule hardcodes its own shell command version independently —
one rule can use v1 while another uses v2 if only that command changed.
Shell file format
Each cmd.py exports a single CMD string using Snakemake's standard
{input}, {output}, {params}, {threads}, {log}, and {wildcards}
placeholders:
# sequana_wrappers/shells/bwa/align/v1/cmd.py
CMD = """\
mkdir -p {params.tmp_directory}
(bwa mem -t {threads} {params.options} {input.reference} {input.fastq} \
| sambamba view -t {threads} -S -f bam -o /dev/stdout /dev/stdin \
| sambamba sort /dev/stdin -o {output.sorted} -t {threads} \
--tmpdir={params.tmp_directory}) \
> {log} 2>&1
"""
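To illustrate how such placeholders resolve, here is a simplified stand-in that uses plain str.format with attribute access. Snakemake performs its own, richer formatting at run time; the render helper, the shortened command, and the sample values below are assumptions made for the example.

```python
from types import SimpleNamespace

# Shortened command written in the same style as a cmd.py file (illustrative).
CMD = "bwa mem -t {threads} {params.options} {input.reference} {input.fastq} > {log} 2>&1"


def render(cmd, **fields):
    """Fill placeholders with str.format; only an approximation of Snakemake."""
    return cmd.format(**fields)


rendered = render(
    CMD,
    threads=2,
    params=SimpleNamespace(options="-M"),
    input=SimpleNamespace(reference="ref.fa", fastq="reads.fastq"),
    log="bwa.log",
)
print(rendered)
```

Named fields such as {params.options} are looked up as attributes, which is why the rule must define every name that the command string refers to.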
Usage in a pipeline rule
get_shell is available as a method on the PipelineManager instance
(already present in every pipeline rules file) — no extra import needed.
The version is hardcoded per rule:
rule bwa:
    input: ...
    output: sorted="{sample}/{sample}.sorted.bam"
    log: "{sample}/bwa/{sample}.log"
    params: options=config["bwa"]["options"],
            tmp_directory=config["bwa"]["tmp_directory"]
    threads: 2
    container: config['apptainers']['bwa']
    shell: manager.get_shell("bwa/align", "v1")
Use "dev" during development before a versioned snapshot exists:
shell: manager.get_shell("bwa/align", "dev")
The container contains only the tool binaries — no Python, no Snakemake.
Adding or updating a shell command
During development:
- Create sequana_wrappers/shells/<tool>/<command>/dev/ with __init__.py and cmd.py.
- Use manager.get_shell("<tool>/<command>", "dev") in the pipeline rule.
- Test against the relevant Damona container.
At release time:
VERSION=v2
mkdir -p sequana_wrappers/shells/<tool>/<command>/${VERSION}
touch sequana_wrappers/shells/<tool>/<command>/${VERSION}/__init__.py
cp sequana_wrappers/shells/<tool>/<command>/dev/cmd.py \
sequana_wrappers/shells/<tool>/<command>/${VERSION}/cmd.py
rm -rf sequana_wrappers/shells/<tool>/<command>/dev
Then update the pipeline rule to manager.get_shell("<tool>/<command>", "v2")
and bump version in pyproject.toml. No git tag required — the directory
is the version.
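The same release steps could be scripted in Python, for example as below. This is a sketch only: promote_dev is a hypothetical helper, and the refusal to overwrite an existing version directory is an added safeguard that the shell commands above do not perform.

```python
import shutil
from pathlib import Path


def promote_dev(command_dir, version):
    """Freeze <command_dir>/dev as <command_dir>/<version>, then remove dev."""
    command_dir = Path(command_dir)
    dev = command_dir / "dev"
    target = command_dir / version
    if not (dev / "cmd.py").is_file():
        raise FileNotFoundError(f"{dev / 'cmd.py'} not found")
    if target.exists():
        # Frozen versions are never edited, so refuse to overwrite one.
        raise FileExistsError(f"{target} already exists")
    target.mkdir(parents=True)
    (target / "__init__.py").touch()
    shutil.copy(dev / "cmd.py", target / "cmd.py")
    shutil.rmtree(dev)
    return target
```

A call such as promote_dev("sequana_wrappers/shells/bwa/align", "v2") would mirror the manual procedure for that command.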
The snippets/ directory — rationale
Some pipeline steps require Python logic that cannot be expressed as a pure
bash command string. Examples: generating a rule graph (needs
sequana_pipetools.DOTParser), post-processing VCF files with a custom Python
class, or running tools that depend on host-side Python libraries.
These steps cannot use shell: + container: (the container has no Python
runtime; and even if it did, the ABI mismatch problem described above applies).
They also cannot use wrapper: for the same ABI reason.
The solution is a snippets/ library of versioned Python callables that are
invoked inside Snakemake run: blocks, running entirely on the host where all
Python dependencies are available:
sequana_wrappers/snippets/<tool>/<command>/<version>/code.py
Each code.py exports an execute(input, output, params) function. Versioning
follows the same convention as shells (v1, v2, …, dev). get_run loads
the callable by path and version — no git-time fetching, no ABI concerns.
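As an illustration of this contract, the sketch below defines a toy execute function and adapts it to a callable that takes a single snakemake-like object, as in the quick-start call. The adapter as_runner and the stand-in object are assumptions; how get_run actually bridges the two is not described here.

```python
from types import SimpleNamespace


def execute(input, output, params):
    """Toy snippet body: report which files it would read and write."""
    return f"{input[0]} -> {output[0]} (config: {params.configname})"


def as_runner(func):
    """Adapt execute(input, output, params) to a one-argument callable."""
    def runner(smk):
        return func(smk.input, smk.output, smk.params)
    return runner


# Stand-in for the object passed as (snakemake) in the quick-start example.
fake = SimpleNamespace(
    input=["Snakefile"],
    output=["rulegraph/rulegraph.svg"],
    params=SimpleNamespace(configname="config.yaml"),
)
message = as_runner(execute)(fake)
print(message)
```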
| Property | Wrappers | Shell library | Snippet library |
|---|---|---|---|
| Python in container | Required | Not needed | N/A (host-side) |
| Container needed | Optional | Yes | No |
| Snakemake directive | wrapper: | shell: | run: |
| Reusable & versioned | Yes | Yes | Yes |
| Python imports on host | Yes | No | Yes |
The repository layout extended with snippets:
sequana-wrappers/
├── wrappers/                  # legacy, maintenance only
└── sequana_wrappers/
    ├── shells/                # container-first shell strings
    │   ├── bwa/align/v1/cmd.py
    │   └── ...
    └── snippets/              # host-side Python callables
        ├── rulegraph/run/v1/code.py
        └── ...
Notes for developers
Overview
wrappers/ is in maintenance mode. Bug fixes are welcome; new wrappers are not accepted. All new tool commands should be added to sequana_wrappers/shells/ instead. See the shells rationale for context.
The wrappers/ directory contains the legacy wrappers. Each sub-directory is dedicated to
a wrapper that is related to a given software/application. A sub directory may have several wrappers (e.g., bwa has a sub directory related to the indexing, and a sub directory related to mapping).
Here is an example of a wrapper tree structure:
fastqc
├── environment.yaml
├── README.md
├── test
│   ├── README.md
│   ├── Snakefile
│   ├── test_R1_.fastq
│   └── test_R2_.fastq
└── wrapper.py
Note that some software may have several sub wrappers (see the bowtie1 wrapper for instance).
A wrapper directory must contain a file called wrapper.py, where developers provide the core of the wrapper. There are no specific instructions here except to write good, well-commented code.
A wrapper directory should have a test directory for continuous integration, with a Snakefile to be tested and possibly data files. Do not add large files here. A README.md should be added to explain the origin of the test data files. Finally, include your tests in the main test.py file at the root of the repository (not in the wrapper itself).
For testing purposes, you should also add a file called environment.yaml listing the packages required for the test (and the wrapper) to work.
Finally, for the documentation, we ask the developer to create a README.md file as described below.
To test your new wrapper (called example here), type:
pytest test.py -k test_example
The config file
If a wrapper requires parameters, they must be defined in a config.yaml file. The same applies to threading. Consider the following points when writing a wrapper:
- The threads parameter should also be a parameter in the config file.
- The params section should contain a key called options that is also defined in the config file.
- Keys or parameters related to directories and files should use the _directory or _file suffixes. This allows the Sequanix application to automatically recognise those options and show a dedicated widget.
Consider this example:
rule falco:
    input: whatever
    output: whatever
    log:
        "samples/{sample}/falco.log"
    threads:
        config['falco']['threads']
    params:
        options=config['falco']['options'],
        wkdir=config['falco']['working_directory']
    wrapper:
        "falco/wrappers/falco"
Your config file will look like:
falco:
    threads: 4
    options: "--verbose"
    working_directory: 'here'
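As a rough illustration of why the suffix convention is useful, a tool can discover path-like options from the key names alone. The helper path_like_keys is hypothetical and is not taken from Sequanix; the config section is written as a plain dict to keep the sketch dependency-free.

```python
def path_like_keys(section):
    """Return the keys that follow the _directory / _file suffix convention."""
    return sorted(k for k in section if k.endswith(("_directory", "_file")))


# The falco section from the example above, as a Python dict.
config = {"falco": {"threads": 4, "options": "--verbose", "working_directory": "here"}}
found = path_like_keys(config["falco"])
print(found)
```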
Naming arguments of the different sections
In all sections (e.g., input, params), if a section has only one item, there is no need to name it; otherwise, please name each item.
rule example1:
    input:
        "test.bam"
    output:
        "test.sorted.bam"
    ...
but:
rule example2:
    input:
        "test.bam"
    output:
        bam="test.sorted.bam",
        bai="test.sorted.bam.bai"
    ...
Documentation
Each wrapper should have dedicated documentation explaining the input/output with a usage example. It should also document the expected configuration file. The file must be formatted in markdown and must contain Documentation and Example subsections. If a Configuration section is found, it is also added to the documentation. This README.md file is rendered automatically via a Sequana sphinx plugin. See the fastqc directory for a working example.
FAQs
Adding a new wrapper in practice
In ./wrappers, add a new wrapper, for instance by copying the existing fastqc wrapper. Edit wrapper.py and design a test/Snakefile example for testing. As a developer, you are probably working in a dedicated branch. Let us call it dev.
In the test/Snakefile, switch from main to dev in the wrapper path:
wrapper:
    "dev/wrappers/my_new_wrapper"
In order to test your Snakefile, you first need to commit the wrapper.py. Then, execute the Snakefile:
snakemake -s Snakefile -j 1 --wrapper-prefix git+file:///YOURPATH/sequana-wrappers/ -f -p
If it fails, edit and commit your wrapper.py and execute again until your Snakefile and wrappers are functional.
Once done, switch the wrapper path back to the main branch:
wrapper:
    "main/wrappers/my_new_wrapper"
Time to include the new wrapper in the continuous integration. Go to the root of sequana-wrappers and add a functional test to the end of test.py. Then, test it:
pytest test.py -k my_new_wrapper -v
You are ready to push and create a pull request.
Tagging
You may consider adding a tag. Our convention is a tag of the form YEAR.MONTH.DAY, where the month and day are not zero-padded. So you would have e.g.:
v23.11.11
v23.2.2
but not v23.02.02
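A tag following this convention can be derived from a date as sketched below. The helper release_tag is hypothetical, and the two-digit year is inferred from the examples above rather than stated as a rule.

```python
from datetime import date


def release_tag(day):
    """Build a tag such as v23.2.2: two-digit year, no zero padding."""
    return f"v{day.year % 100}.{day.month}.{day.day}"


tag = release_tag(date(2023, 2, 2))
print(tag)  # v23.2.2
```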
We use annotated tags so the command is e.g.:
git tag -a v23.11.11
and then push it to github:
git push origin main v23.11.11