A pipeline for designing primers optimized for droplet digital PCR

ddPrimer: Advanced Droplet Digital PCR Primer Design

License: MIT | Python 3.10+

A comprehensive pipeline for designing primers and probes specifically optimized for droplet digital PCR (ddPCR).

Key Features

  • Complete End-to-End Pipeline: Design primers from genome sequences through a streamlined workflow built on Primer3
  • ddPCR-Specific Utilities: Restriction cutting, GC% filtering, and design parameters tuned for ddPCR applications
  • Gene Annotation Filtering: Filter fragments by gene overlap using GFF files for targeted design
  • SNP Masking: Avoid designing primers across variant positions using VCF files with allele frequency (AF)-based processing
  • Thermodynamic Optimization: Calculate ΔG values using ViennaRNA to prevent unwanted secondary structures
  • Specificity Verification: Integrated BLAST validation for both primers and probes
  • File Preparation: Automatic VCF normalization, chromosome mapping, and file indexing
  • Output Format: Results are saved as an Excel file with primer sequences, coordinates, thermodynamics, and quality metrics
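
To make the GC% filtering feature concrete, here is a minimal sketch of a GC-content check. It is an illustration only, not ddPrimer's own code: the function names gc_percent and passes_gc_filter are hypothetical, and the 50-60% window is borrowed from the PRIMER_MIN_GC and PRIMER_MAX_GC values in the example configuration.

```python
def gc_percent(seq: str) -> float:
    """Return the GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_gc_filter(seq: str, min_gc: float = 50.0, max_gc: float = 60.0) -> bool:
    """Hypothetical filter: keep sequences whose GC% lies in [min_gc, max_gc]."""
    return min_gc <= gc_percent(seq) <= max_gc


# Two made-up candidate primers: one balanced, one AT-rich.
candidates = ["ATGCGCGTACGCTAGCGTAC", "ATATATATTTAAATATATAT"]
kept = [seq for seq in candidates if passes_gc_filter(seq)]
print(kept)
```

Only the first candidate falls inside the window; the AT-rich one is discarded.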

Installation

Quick Install with Conda

# Clone and install
git clone https://github.com/globuzzz2000/ddPrimer
cd ddPrimer

# Create environment with all dependencies
conda create -n ddprimer python=3.10
conda activate ddprimer
conda install -c bioconda -c conda-forge blast bcftools samtools
pip install -e .

# Alternatively install external tools via system package manager:
# macOS: brew install blast bcftools samtools
# Linux: sudo apt-get install ncbi-blast+ bcftools samtools

Required External Tools

  • NCBI BLAST+: For specificity checking
  • bcftools and samtools: For file processing

Python dependencies are automatically installed via pip.
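
Before a first run, it can help to confirm that the external tools are on your PATH. The sketch below uses only the Python standard library. The executable names blastn and makeblastdb are the standard BLAST+ binaries; whether ddPrimer calls exactly these is an assumption, and missing_tools is a hypothetical helper, not part of the package.

```python
import shutil

# Executables assumed to be needed (BLAST+ ships blastn and makeblastdb).
REQUIRED_TOOLS = ("blastn", "makeblastdb", "bcftools", "samtools")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing external tools: " + ", ".join(missing))
else:
    print("All external tools found.")
```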

Quick Start

Command Line Usage

# Basic primer design with file preparation
ddprimer --fasta genome.fasta --vcf variants.vcf --gff annotations.gff

# Basic primer design without annotation filtering
ddprimer --noannotation --fasta genome.fasta --vcf variants.vcf

Interactive Mode

Simply run ddprimer without arguments to launch the interactive mode, which will guide you through file selection with a graphical interface.

Project Structure

ddPrimer/
├── ddprimer/                         # Main package directory
│   ├── __init__.py
│   ├── main.py                       # Main entry point for the pipeline
│   ├── core/                         # Core processing modules
│   │   ├── __init__.py
│   │   ├── annotation_processor.py   # GFF-based annotation filtering
│   │   ├── sequence_processor.py     # Sequence processing
│   │   ├── snp_processor.py          # VCF-based variant masking
│   │   ├── primer3_processor.py      # Primer3-based primer design
│   │   ├── primer_processor.py       # Primer parsing
│   │   ├── filter_processor.py       # Primer quality filtering
│   │   ├── thermo_processor.py       # ViennaRNA thermodynamic processor
│   │   └── blast_processor.py        # Primer specificity filtering
│   ├── utils/                        # Utility functions
│   │   ├── __init__.py
│   │   ├── file_preparator.py        # File validation and preparation
│   │   ├── file_io.py                # File I/O and Excel formatting
│   │   ├── blast_db_manager.py       # Unified BLAST database management
│   │   ├── direct_mode.py            # Target list-based primer design
│   │   └── primer_remapper.py        # Primer coordinate remapping to a different genome
│   └── config/                       # Configuration and settings
│       ├── __init__.py
│       ├── config.py                 # Core configuration settings
│       ├── config_display.py         # Configuration display
│       ├── exceptions.py             # Error handling
│       ├── logging_config.py         # Logging setup
│       └── template_generator.py     # Configuration template generation
├── pyproject.toml                    # Package configuration and dependencies
└── README.md                  

Workflow Overview

  1. Input Selection: Choose genome FASTA, variant VCF, and annotation GFF files
  2. File Preparation: Validate and prepare files (bgzip compression, indexing, normalization, chromosome mapping)
  3. Sequence Preparation: Filter sequences based on restriction sites and gene boundaries
  4. Variant Processing: Apply VCF variants to sequences, masking or substituting positions based on allele frequency (AF)
  5. Primer Design: Design primer and probe candidates using Primer3
  6. Quality Filtering: Apply filters for penalties, repeats, GC content, and more
  7. Thermodynamic Analysis: Calculate secondary structure stability using ViennaRNA
  8. Specificity Checking: Validate specificity using BLAST
  9. Result Export: Generate comprehensive Excel output
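
The sequence preparation step (step 3) can be pictured with a short sketch. This is a simplified illustration under stated assumptions, not the pipeline's implementation: split_at_site is a hypothetical function, the recognition site is dropped entirely rather than cut at the enzyme's true cleavage position, and the defaults mirror RESTRICTION_SITE and MIN_SEGMENT_LENGTH from the example configuration.

```python
def split_at_site(sequence: str, site: str = "GGCC", min_length: int = 90):
    """Split a sequence at every occurrence of a restriction site.

    Returns (offset, fragment) pairs for fragments of at least min_length
    bases, where offset is the 0-based start of the fragment in the input.
    Simplification: the site itself is removed from the fragments.
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append((start, seq[start:pos]))
        start = pos + len(site)
        pos = seq.find(site, start)
    fragments.append((start, seq[start:]))  # tail after the last site
    return [(offset, frag) for offset, frag in fragments if len(frag) >= min_length]


# Toy input: 100 bases, a site, 50 bases, a site, 95 bases.
toy = "A" * 100 + "GGCC" + "T" * 50 + "GGCC" + "A" * 95
for offset, fragment in split_at_site(toy):
    print(offset, len(fragment))
```

The 50-base middle fragment is shorter than the minimum segment length, so only the two outer fragments survive.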

Configuration

Customize the pipeline behavior with a JSON configuration file:

ddprimer --config config.json

Example configuration:

{
  "NUM_PROCESSES": 6,
  "SHOW_PROGRESS": true,
  "PRIMER_MIN_SIZE": 18,
  "PRIMER_OPT_SIZE": 20,
  "PRIMER_MAX_SIZE": 23,
  "PRIMER_MIN_GC": 50.0,
  "PRIMER_MAX_GC": 60.0,
  "MIN_SEGMENT_LENGTH": 90,
  "RETAIN_TYPES": "['gene']",
  "RESTRICTION_SITE": "GGCC",
  "PENALTY_MAX": 5.0
}
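
Hand-edited JSON is easy to break, for instance with a trailing comma after the last entry, which strict JSON parsers reject. One way to avoid this is to generate the file from Python. The sketch below is a suggestion rather than part of ddPrimer: render_config and is_strict_json are hypothetical helpers, and the keys are copied from the example configuration.

```python
import json

# Settings copied from the example configuration.
settings = {
    "NUM_PROCESSES": 6,
    "SHOW_PROGRESS": True,
    "PRIMER_MIN_SIZE": 18,
    "PRIMER_OPT_SIZE": 20,
    "PRIMER_MAX_SIZE": 23,
    "PRIMER_MIN_GC": 50.0,
    "PRIMER_MAX_GC": 60.0,
    "MIN_SEGMENT_LENGTH": 90,
    "RETAIN_TYPES": "['gene']",
    "RESTRICTION_SITE": "GGCC",
    "PENALTY_MAX": 5.0,
}


def render_config(values: dict) -> str:
    """Serialize settings to JSON text and confirm it parses back."""
    text = json.dumps(values, indent=2)
    json.loads(text)  # raises json.JSONDecodeError if the text is invalid
    return text + "\n"


def is_strict_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(render_config(settings))
```

Writing the returned text to config.json (for example with pathlib.Path("config.json").write_text(...)) gives a file that can be passed to ddprimer --config config.json.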

Configuration Management

# Display current configuration
ddprimer --config

# Display Primer3 settings
ddprimer --config primer3

# Generate configuration template
ddprimer --config template

Additional Utilities

Direct Mode

Design primers directly from target sequences supplied as a CSV or Excel file, bypassing genome-based processing.

CSV/Excel format: Two-column table with sequence ID and DNA sequence

# Basic direct mode with sequence table
ddprimer --direct sequences.csv

# Direct mode with SNP masking (target sequences should exactly match the reference sequence)
ddprimer --direct sequences.xlsx --snp --vcf variants.vcf --fasta reference.fa
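
A two-column sequence table for direct mode can be produced with Python's csv module. The sketch below is illustrative only: build_sequence_table is a hypothetical helper, the sequence IDs and sequences are made up, and because it is not stated whether ddPrimer expects a header row, the header is optional and off by default.

```python
import csv
import io


def build_sequence_table(records, header=None):
    """Return CSV text with one (sequence ID, DNA sequence) row per record.

    Sequences are upper-cased and checked against the alphabet ACGTN.
    The optional header is a pair of column names (an assumption; the
    expected header, if any, is not documented here).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header is not None:
        writer.writerow(header)
    for seq_id, seq in records:
        cleaned = seq.strip().upper()
        invalid = set(cleaned) - set("ACGTN")
        if invalid:
            raise ValueError(f"{seq_id}: unexpected characters {sorted(invalid)}")
        writer.writerow([seq_id, cleaned])
    return buffer.getvalue()


# Made-up example records.
table = build_sequence_table([("target_1", "acgtacgtgg"), ("target_2", "TTGGCCAANN")])
print(table)
```

Saving the returned text as sequences.csv yields a file in the shape described above.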

BLAST Database Management

Create and manage BLAST databases for primer specificity checking:

# Create BLAST database from FASTA file
ddprimer --db genome.fasta

# Select from existing databases, including model organisms (E. coli, S. cerevisiae, etc.)
ddprimer --db

# Use custom database name
ddprimer --db genome.fasta my_custom_db
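
For orientation, building a nucleotide BLAST database by hand is done with the BLAST+ makeblastdb tool. The sketch below only assembles and prints such a command. Whether ddprimer --db runs makeblastdb with these exact options is an assumption, and makeblastdb_command is a hypothetical helper.

```python
import shlex


def makeblastdb_command(fasta_path: str, db_name: str) -> list:
    """Build the argument list for a nucleotide makeblastdb call."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_name]


cmd = makeblastdb_command("genome.fasta", "my_custom_db")
print(shlex.join(cmd))
```

The printed command can be run in a shell, or the list can be passed to subprocess.run(cmd, check=True).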

Primer Remapping

Update existing ddprimer output coordinates and annotations against a new reference genome:

# With gene annotation updates
ddprimer --remap primers.csv --fasta ref.fa --gff annotations.gff

# Skip annotation updates
ddprimer --remap primers.xlsx --fasta ref.fa --noannotation

Troubleshooting

Common issues and solutions:

  • Missing BLAST database: Run with --db to create or select a database
  • GUI errors: Use --cli to force command-line mode
  • File compatibility errors: The pipeline will attempt automatic file preparation

For more detailed output, run ddprimer --debug or check the logs in ~/.ddPrimer/logs/.
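
To find the most recent log quickly, a small helper like the following can be used. It is a convenience sketch, not part of ddPrimer: newest_log is a hypothetical function, and it assumes only that logs are regular files inside ~/.ddPrimer/logs/, without relying on any particular file naming scheme.

```python
from pathlib import Path

DEFAULT_LOG_DIR = Path.home() / ".ddPrimer" / "logs"


def newest_log(log_dir=DEFAULT_LOG_DIR):
    """Return the most recently modified file in log_dir, or None."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        return None
    files = [path for path in log_dir.iterdir() if path.is_file()]
    return max(files, key=lambda path: path.stat().st_mtime, default=None)


latest = newest_log()
print(latest if latest is not None else "No log files found")
```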

License

This project is licensed under the MIT License - see the LICENSE file for details.
