Pipelines for genomics analysis

SeqNado

A Snakemake-based bioinformatics toolkit for analyzing sequencing data from ATAC-seq, ChIP-seq, CUT&Tag, RNA-seq, SNP analysis, Methylation, CRISPR screens, and Micro-Capture-C experiments.

Modular, reproducible, and container-ready pipelines powered by Snakemake that take you from raw data to publication-ready results.

Key Features

  • Comprehensive Assay Support: Single framework for multiple sequencing assays
  • GEO/SRA Integration: Download and process public datasets directly from GEO/SRA repositories
  • Customizable Workflows: Easily modify parameters and swap in different tools for peak calling, bigWig generation, and more
  • User-Friendly CLI: Intuitive command-line interface that guides you through setup and execution
  • Multiomics Support: Analyze and integrate data from multiple sequencing assays in a single workflow
  • Snakemake-Powered: Modular workflows with automatic parallelization and resource management
  • Container-Ready: Fully containerized pipelines using Apptainer/Singularity for reproducibility
  • HPC-Optimized: Seamless integration with SLURM and local execution modes
  • Advanced Analysis:
    • Comprehensive QC with MultiQC reports
    • Peak calling with MACS2, SEACR, HOMER, and LanceOtron
    • Consensus peakset generation and quantification across samples
    • Spike-in normalization for ChIP-seq, ATAC-seq, and RNA-seq
    • Automated differential expression with DESeq2 for RNA-seq
    • Genome browser style plots with PlotNado
    • UCSC genome browser hub generation
    • ML-ready dataset creation
  • Flexible Configuration: Interactive CLI for setup, or scriptable non-interactive mode
  • Machine Learning Ready: Tools for preparing datasets for ML applications

Supported Assays

  • ATAC-seq (atac) - Chromatin accessibility profiling with TSS enrichment and fragment analysis
  • ChIP-seq (chip) - Protein-DNA interaction mapping with spike-in support
  • CUT&Tag (cat) - Low-input epigenomic profiling optimized for sparse signals
  • RNA-seq (rna) - Transcriptome analysis with automated DESeq2 differential expression
  • SNP Analysis (snp) - Variant detection and genotyping workflows
  • Methylation (meth) - Bisulfite/TAPS sequencing for DNA methylation analysis
  • CRISPR Screens (crispr) - Guide-level quantification and screen statistics
  • Micro-Capture-C (mcc) - Chromatin conformation capture analysis
  • Multiomics - Run multiple assay types together in a single integrated workflow

View detailed assay workflows

Installation

Via Mamba (Recommended)

Install from the Bioconda channel:

mamba create -n seqnado -c bioconda seqnado
mamba activate seqnado

Via uv (Fast Alternative)

Install using uv, a fast Python package installer:

uv venv seqnado-env
source seqnado-env/bin/activate  # On macOS/Linux; use 'seqnado-env\Scripts\activate' on Windows
uv pip install seqnado

Via Pip

Alternatively, install using pip:

pip install seqnado

Initialize SeqNado

After installation, initialize your SeqNado environment:

seqnado init

What this does:

  • Sets up genome configuration templates in ~/.config/seqnado/
  • Configures Apptainer/Singularity containers (if available)
  • Installs Snakemake execution profiles for local and cluster execution

Learn more about initialization

Quick Start

Complete workflow from installation to results in 5 steps:

1. Set Up Genome References

Before processing data, configure reference genomes for alignment:

# List available genomes
seqnado genomes list atac

# Build a custom genome
seqnado genomes build rna --fasta hg38.fasta --name hg38 --outdir /path/to/genomes

Complete genome setup guide

2. Create Project Configuration

Generate a configuration file and project directory for your experiment:

seqnado config atac

Output: A dated project directory with configuration file and FASTQ folder:

YYYY-MM-DD_ATAC_project/
├── config_atac.yaml    # Edit this to customize analysis parameters
└── fastqs/             # Place your FASTQ files here

Configuration options guide

3. Add FASTQ Files

Option A: Use Your Own Data

Symlink your raw sequencing data into the project directory:

ln -s /path/to/fastq/*.fastq.gz YYYY-MM-DD_ATAC_project/fastqs/

Option B: Download from GEO/SRA

Download public datasets directly from GEO/SRA repositories:

# Download data using a metadata TSV file
seqnado download metadata.tsv -o YYYY-MM-DD_ATAC_project/fastqs/ -a atac --cores 4

GEO/SRA download guide
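The authoritative TSV schema is defined in the GEO/SRA download guide; purely as an illustrative sketch (the column names below are assumptions, not the real specification), a metadata file pairing accessions with sample names might look like:

```tsv
accession	sample_name
SRR0000001	treated_rep1
SRR0000002	control_rep1
```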

Note: Use symbolic links to avoid duplicating large files.
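The symlinking step can be tried end-to-end with placeholder files (all paths and filenames below are illustrative stand-ins, not part of SeqNado):

```shell
# Sketch: link placeholder FASTQs into a project's fastqs/ directory
# without copying them. Paths here are throwaway examples.
set -euo pipefail

mkdir -p /tmp/seqnado_demo/raw /tmp/seqnado_demo/project/fastqs
touch /tmp/seqnado_demo/raw/sample1_R1.fastq.gz \
      /tmp/seqnado_demo/raw/sample1_R2.fastq.gz

# ln -s creates symbolic links, so the large originals are not duplicated
ln -sf /tmp/seqnado_demo/raw/*.fastq.gz /tmp/seqnado_demo/project/fastqs/

ls -l /tmp/seqnado_demo/project/fastqs/
```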

4. Generate Sample Metadata

Create a metadata CSV that describes your experimental design:

seqnado design atac

Output: metadata_atac.csv — Edit this file to specify:

  • Sample names and groupings
  • Experimental conditions
  • Control/treatment relationships
  • DESeq2 comparisons (for RNA-seq)

Design file specification
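For scripted setups, such a metadata file can also be generated programmatically. The sketch below is illustrative only: the column names are assumptions, and the design file specification linked above defines the real schema.

```python
# Hypothetical sketch of writing a minimal ATAC-seq design file.
# Column names ("sample", "condition") are placeholders; consult the
# design file specification for the columns SeqNado actually expects.
import csv

rows = [
    {"sample": "treated_rep1", "condition": "treated"},
    {"sample": "control_rep1", "condition": "control"},
]

with open("metadata_atac.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=["sample", "condition"])
    writer.writeheader()
    writer.writerows(rows)
```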

5. Run the Pipeline

Execute the analysis workflow (choose one based on your environment):

# Local machine (uses all available cores)
seqnado pipeline atac --preset le

# HPC cluster with SLURM scheduler
seqnado pipeline atac --preset ss --queue short

# Multiomics mode (processes multiple assays together)
seqnado pipeline --preset ss  # Detects all config files in current directory

Pipeline execution details | Output files explained

Common Pipeline Options

Execution Presets: These presets configure Snakemake execution parameters for different environments. The default presets are optimized for typical use on a SLURM-based HPC cluster; they are saved in ~/.config/seqnado/ when you run seqnado init and can be customized as needed.

  • --preset le - Local execution (default, recommended for workstations)
  • --preset lc - Local execution using conda environments
  • --preset ss - SLURM scheduler (for HPC clusters)

Resource Management:

  • --queue short - Specify SLURM partition/queue name

This is only needed on HPC clusters where multiple partitions/queues are available. The default queue can be set in the SLURM preset configuration file.

  • -s/--scale-resources 1.5 - Multiply memory/time requirements by 1.5×

This helps ensure jobs have sufficient resources and reduces the likelihood of out-of-memory errors, which is especially useful for deeply sequenced datasets, and it avoids manually adjusting each rule's resource requirements on the command line.
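To make the multiplier's effect concrete, here is a small Python sketch of the arithmetic. This is an illustration of the idea only (scaling each rule's declared memory and time by a factor); it is not SeqNado's actual implementation.

```python
# Illustrative only: what a resource-scaling multiplier like
# --scale-resources 1.5 conceptually does to a rule's requirements.
import math


def scale_resources(resources: dict, factor: float) -> dict:
    """Scale numeric resource requirements, rounding up to whole units."""
    return {name: math.ceil(value * factor) for name, value in resources.items()}


# A hypothetical rule requesting 4 GB of memory and 120 minutes of runtime
rule_resources = {"mem_mb": 4000, "runtime_min": 120}
print(scale_resources(rule_resources, 1.5))  # {'mem_mb': 6000, 'runtime_min': 180}
```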

Options passed to Snakemake:

Any Snakemake command-line options are passed through automatically when you run seqnado pipeline, letting you customize workflow execution. For example, specify --rerun-incomplete to automatically rerun any failed or incomplete jobs, or --keep-going to continue running independent jobs even if some fail.

Very useful flags:

  • --rerun-incomplete - Automatically rerun any failed or incomplete jobs
  • --keep-going - Continue running independent jobs even if some fail
  • --unlock - Unlock the workflow if it becomes locked after an error, interruption, or cancellation, allowing you to fix the issue and rerun without manually deleting the lock file

Debugging & Testing:

  • -n - Dry run to preview commands without executing

All CLI options | HPC cluster setup

Documentation

For comprehensive guides and API documentation, visit:

📚 SeqNado Documentation

License

This project is licensed under GPL-3.0.


File details

Details for the file seqnado-1.0.3.tar.gz.

File metadata

  • Download URL: seqnado-1.0.3.tar.gz
  • Upload date:
  • Size: 376.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for seqnado-1.0.3.tar.gz
Algorithm Hash digest
SHA256 2183f9656170a396e939e8ba7a18dd14478a3c36b8de38cdc7bdeb1382bf3a8f
MD5 32713ccef1001bf8d8d024088ffb5513
BLAKE2b-256 44605922ac13a8d8b16bb67987fee82c1b8a4255b6e884bf8817cb4ff668696c

See more details on using hashes here.

File details

Details for the file seqnado-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: seqnado-1.0.3-py3-none-any.whl
  • Upload date:
  • Size: 418.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for seqnado-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 b536454aed15c249a76fb0c001dda6d1deea36826166dc1fd197660aac617e23
MD5 b2088bc4b1501ca6b00722c7a6d9d613
BLAKE2b-256 940d200b2d43decc8e75423744dff23703a170306e142d69b780663b6b1b4ce0

See more details on using hashes here.
