Pipelines for genomics analysis

SeqNado

A Snakemake-based bioinformatics toolkit for analyzing sequencing data from ATAC-seq, ChIP-seq, CUT&Tag, RNA-seq, SNP analysis, Methylation, CRISPR screens, and Micro-Capture-C experiments.

Modular, reproducible, and container-ready pipelines powered by Snakemake that take you from raw data to publication-ready results.

Key Features

  • Comprehensive Assay Support: Single framework for multiple sequencing assays
  • GEO/SRA Integration: Download and process public datasets directly from GEO/SRA repositories
  • Customizable Workflows: Easily modify parameters and swap tools for steps such as peak calling and bigWig generation
  • User-Friendly CLI: Intuitive command-line interface that guides you through setup and execution
  • Multiomics Support: Analyze and integrate data from multiple sequencing assays in a single workflow
  • Snakemake-Powered: Modular workflows with automatic parallelization and resource management
  • Container-Ready: Fully containerized pipelines using Apptainer/Singularity for reproducibility
  • HPC-Optimized: Seamless integration with SLURM and local execution modes
  • Advanced Analysis:
    • Comprehensive QC with MultiQC reports
    • Peak calling with MACS2, SEACR, HOMER, and LanceOtron
    • Consensus peakset generation and quantification across samples
    • Spike-in normalization for ChIP-seq, ATAC-seq, and RNA-seq
    • Automated differential expression with DESeq2 for RNA-seq
    • Genome browser style plots with PlotNado
    • UCSC genome browser hub generation
    • ML-ready dataset creation
  • Flexible Configuration: Interactive CLI for setup, or scriptable non-interactive mode
  • Machine Learning Ready: Tools for preparing datasets for ML applications

Supported Assays

  • ATAC-seq (atac) - Chromatin accessibility profiling with TSS enrichment and fragment analysis
  • ChIP-seq (chip) - Protein-DNA interaction mapping with spike-in support
  • CUT&Tag (cat) - Low-input epigenomic profiling optimized for sparse signals
  • RNA-seq (rna) - Transcriptome analysis with automated DESeq2 differential expression
  • SNP Analysis (snp) - Variant detection and genotyping workflows
  • Methylation (meth) - Bisulfite/TAPS sequencing for DNA methylation analysis
  • CRISPR Screens (crispr) - Guide-level quantification and screen statistics
  • Micro-Capture-C (mcc) - Chromatin conformation capture analysis
  • Multiomics - Run multiple assay types together in a single integrated workflow

View detailed assay workflows

Installation

Via Mamba (Recommended)

Install from the Bioconda channel:

mamba create -n seqnado -c bioconda seqnado
mamba activate seqnado

Via uv (Fast Alternative)

Install using uv, a fast Python package installer:

uv venv seqnado-env
source seqnado-env/bin/activate  # On macOS/Linux; use 'seqnado-env\Scripts\activate' on Windows
uv pip install seqnado

Via Pip

Alternatively, install using pip:

pip install seqnado
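
Whichever install route you choose, it is worth confirming that the command is visible in the active environment before moving on. The sketch below uses only the shell built-in `command -v`, so it does not assume any particular SeqNado flag; the helper name `check_cmd` is hypothetical.

```shell
# Hypothetical helper: report whether a command can be found on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Usage, inside the activated environment:
#   check_cmd seqnado
```

If the result is "missing", the usual cause is that the mamba environment or virtual environment has not been activated in the current shell.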

Initialize SeqNado

After installation, initialize your SeqNado environment:

seqnado init

What this does:

  • Sets up genome configuration templates in ~/.config/seqnado/
  • Configures Apptainer/Singularity containers (if available)
  • Installs Snakemake execution profiles for local and cluster execution
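
To confirm that initialization wrote its files, you can list the configuration directory named above. This is a minimal sketch: the helper name `show_seqnado_config` is hypothetical, and because the exact files inside the directory are not specified here, the function only lists whatever is present.

```shell
# Hypothetical helper: list the SeqNado config directory, or say that it is missing.
# The default path is the one given above; pass another directory to override it.
show_seqnado_config() {
    config_dir=${1:-"$HOME/.config/seqnado"}
    if [ -d "$config_dir" ]; then
        ls -1 "$config_dir"
    else
        echo "not found: $config_dir (has 'seqnado init' been run?)"
        return 1
    fi
}

# Usage:
#   show_seqnado_config
```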

Learn more about initialization

Quick Start

Complete workflow from installation to results in 5 steps:

1. Set Up Genome References

Before processing data, configure reference genomes for alignment:

# List available genomes
seqnado genomes list atac

# Build a custom genome
seqnado genomes build rna --fasta hg38.fasta --name hg38 --outdir /path/to/genomes

Complete genome setup guide

2. Create Project Configuration

Generate a configuration file and project directory for your experiment:

seqnado config atac

Output: A dated project directory with configuration file and FASTQ folder:

YYYY-MM-DD_ATAC_project/
├── config_atac.yaml    # Edit this to customize analysis parameters
└── fastqs/             # Place your FASTQ files here

Configuration options guide

3. Add FASTQ Files

Option A: Use Your Own Data

Symlink your raw sequencing data into the project directory:

ln -s /path/to/fastq/*.fastq.gz YYYY-MM-DD_ATAC_project/fastqs/
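
If the one-line `ln -s` is run with a relative source path, the links can break when the project directory is moved. The sketch below is one way to guard against that, assuming your files end in `.fastq.gz`; the function name `link_fastqs` is hypothetical and not part of SeqNado.

```shell
# Hypothetical helper: symlink every *.fastq.gz from a source directory into fastqs/.
# The source is resolved to an absolute path so the links survive moving the project.
link_fastqs() {
    src_dir=$(cd "$1" && pwd) || return 1
    dest_dir=$2
    mkdir -p "$dest_dir"
    for fq in "$src_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue                 # the glob matched nothing
        target="$dest_dir/$(basename "$fq")"
        if [ -e "$target" ] || [ -L "$target" ]; then
            echo "skipping existing: $target"    # never overwrite a file or link
            continue
        fi
        ln -s "$fq" "$target"
    done
}

# Usage:
#   link_fastqs /path/to/fastq YYYY-MM-DD_ATAC_project/fastqs
```

Running it a second time is harmless: existing links are reported and left in place.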

Option B: Download from GEO/SRA

Download public datasets directly from GEO/SRA repositories:

# Download data using a metadata TSV file
seqnado download metadata.tsv -o YYYY-MM-DD_ATAC_project/fastqs/ -a atac --cores 4

GEO/SRA download guide

Note: Use symbolic links to avoid duplicating large files.

4. Generate Sample Metadata

Create a metadata CSV that describes your experimental design:

seqnado design atac

Output: metadata_atac.csv — Edit this file to specify:

  • Sample names and groupings
  • Experimental conditions
  • Control/treatment relationships
  • DESeq2 comparisons (for RNA-seq)

Design file specification

5. Run the Pipeline

Execute the analysis workflow (choose one based on your environment):

# Local machine (uses all available cores)
seqnado pipeline atac --preset le

# HPC cluster with SLURM scheduler
seqnado pipeline atac --preset ss --queue short

# Multiomics mode (processes multiple assays together)
seqnado pipeline --preset ss  # Detects all config files in current directory

Pipeline execution details | Output files explained
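
Before a multiomics run, it can help to see which assay configurations sit in the working directory. The sketch below assumes the `config_<assay>.yaml` naming shown in step 2 (`config_atac.yaml`). How SeqNado itself discovers the files is not described here, and `list_assay_configs` is a hypothetical helper.

```shell
# Hypothetical helper: print the assay name for each config_<assay>.yaml in a directory.
# Defaults to the current directory when no argument is given.
list_assay_configs() {
    for cfg in "${1:-.}"/config_*.yaml; do
        [ -e "$cfg" ] || continue          # the glob matched nothing
        name=$(basename "$cfg" .yaml)      # e.g. config_atac
        echo "${name#config_}"             # e.g. atac
    done
}

# Usage:
#   list_assay_configs
```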

Common Pipeline Options

Execution Presets: These presets configure Snakemake execution parameters for different environments. The default presets are optimized for typical use on a SLURM-based HPC cluster. They are saved in ~/.config/seqnado/ when you run seqnado init and can be customized as needed.

  • --preset le - Local execution (default, recommended for workstations)
  • --preset lc - Local execution using conda environments
  • --preset ss - SLURM scheduler (for HPC clusters)

Resource Management:

  • --queue short - Specify SLURM partition/queue name

This is only needed on HPC clusters that use multiple partitions. The default queue can be set in the SLURM preset configuration file.

  • -s/--scale-resources 1.5 - Multiply memory/time requirements by 1.5×

This is useful on HPC clusters to ensure jobs have sufficient resources and to reduce the likelihood of out-of-memory errors. It is particularly helpful for very deeply sequenced datasets, because it avoids manually adjusting the resource requirements of each rule on the command line.
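
The effect of a scale factor is plain multiplication. The sketch below works through it for two made-up base requests (8 GB of memory and 60 minutes of run time). These numbers, the helper name `scale_request`, and the choice to round up are illustrative assumptions, not values or behaviour taken from SeqNado.

```shell
# Illustrative arithmetic only: multiply a base request by a factor, rounding up.
scale_request() {
    awk -v base="$1" -v factor="$2" \
        'BEGIN { v = base * factor; r = int(v); if (r < v) r++; print r }'
}

scale_request 8 1.5     # hypothetical 8 GB memory request    -> prints 12
scale_request 60 1.5    # hypothetical 60 minute time request -> prints 90
```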

Commands passed to snakemake:

Any Snakemake command-line options are passed through automatically when you run seqnado pipeline, so you can customize how the workflow executes without editing the pipeline itself.

Very useful flags:

  • --rerun-incomplete - Automatically rerun any failed or incomplete jobs
  • --keep-going - Continue running independent jobs even if some fail
  • --unlock - Unlock the workflow if it becomes locked due to an error or interruption or the workflow was cancelled before completion. This allows you to fix the issue and then rerun the workflow without needing to manually delete the lock file.
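
After an interrupted run, these flags are typically combined in two passes: unlock first, then rerun. The sketch below relies on the pass-through behaviour described above and on the fact that Snakemake's --unlock only removes the lock and exits. The function name `resume_run` is hypothetical.

```shell
# Hypothetical helper: unlock a stalled workflow, then rerun incomplete jobs.
# "$@" carries the same arguments to both calls, e.g.: atac --preset ss
resume_run() {
    seqnado pipeline "$@" --unlock || return 1
    seqnado pipeline "$@" --rerun-incomplete --keep-going
}

# Usage:
#   resume_run atac --preset ss --queue short
```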

Debugging & Testing:

  • -n - Dry run to preview commands without executing
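
A common habit is to preview the plan with -n and launch the real run only if the preview succeeds. The sketch below wraps that pattern using the commands shown earlier in this section; `dry_then_run` is a hypothetical name.

```shell
# Hypothetical helper: do a dry run first, and start the real run only if it succeeds.
# "$@" carries the same arguments to both calls, e.g.: atac --preset le
dry_then_run() {
    seqnado pipeline "$@" -n || {
        echo "dry run failed; not starting the pipeline" >&2
        return 1
    }
    seqnado pipeline "$@"
}

# Usage:
#   dry_then_run atac --preset le
```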

All CLI options | HPC cluster setup

Documentation

For comprehensive guides and API documentation, visit:

📚 SeqNado Documentation

License

This project is licensed under GPL3.
