Universal Read Analysis of DIMErs

URAdime

URAdime (Universal Read Analysis of DIMErs) is a Python package for identifying and analyzing primer sequences in the reads of a BAM file.

Installation

pip install uradime

Usage

URAdime can be used both as a command-line tool and as a Python package.

Command Line Interface

# Basic usage
uradime -b input.bam -p primers.tsv -o results/my_analysis

# Full options
#   -b               Input BAM file
#   -p               Primer file (tab-separated)
#   -o               Output prefix
#   -t               Number of threads
#   -m               Maximum reads to process (0 for all)
#   -c               Chunk size for parallel processing
#   -u               Process only unaligned reads
#   --max-distance   Maximum Levenshtein distance for matching
#   -v               Verbose output
uradime \
    -b input.bam \
    -p primers.tsv \
    -o results/my_analysis \
    -t 8 \
    -m 1000 \
    -c 100 \
    -u \
    --max-distance 2 \
    -v
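When the CLI is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch that uses only the flags documented above; the helper names `build_uradime_command` and `run_uradime` are hypothetical and are not part of URAdime.

```python
import subprocess


def build_uradime_command(bam, primers, prefix, threads=None, max_reads=None,
                          chunk_size=None, unaligned_only=False,
                          max_distance=None, verbose=False):
    """Build an argv list for the uradime CLI (hypothetical helper)."""
    cmd = ["uradime", "-b", str(bam), "-p", str(primers), "-o", str(prefix)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if max_reads is not None:
        cmd += ["-m", str(max_reads)]
    if chunk_size is not None:
        cmd += ["-c", str(chunk_size)]
    if unaligned_only:
        cmd.append("-u")
    if max_distance is not None:
        cmd += ["--max-distance", str(max_distance)]
    if verbose:
        cmd.append("-v")
    return cmd


def run_uradime(*args, **kwargs):
    """Run uradime and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_uradime_command(*args, **kwargs), check=True)


print(build_uradime_command("input.bam", "primers.tsv", "results/my_analysis",
                            threads=8, max_distance=2))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.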

Python Package

from uradime import bam_to_fasta_parallel, create_analysis_summary, load_primers

# Load and analyze BAM file
result_df = bam_to_fasta_parallel(
    bam_path="your_file.bam",
    primer_file="primers.tsv",
    num_threads=4
)

# Load primers for analysis
primers_df, _ = load_primers("primers.tsv")

# Create analysis summary
summary_df, matched_pairs, mismatched_pairs = create_analysis_summary(result_df, primers_df)

Input Files

Primer File Format (TSV)

The primer file should be tab-separated with the following columns:

  • Name: Primer pair name
  • Forward: Forward primer sequence
  • Reverse: Reverse primer sequence
  • Size: Expected amplicon size

Example:

Name    Forward         Reverse         Size
Pair1   ATCGATCGATCG    TAGCTAGCTAGC    100
Pair2   GCTAGCTAGCTA    CGATTCGATCGA    150
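A primer file that was saved with spaces instead of tabs is an easy mistake to make, so a quick check before running the analysis may be useful. The sketch below uses only the standard library; `validate_primer_tsv` is a hypothetical helper, and the accepted alphabet (A, C, G, T, N) is an assumption rather than something URAdime documents.

```python
import csv
import io

REQUIRED_COLUMNS = ["Name", "Forward", "Reverse", "Size"]
VALID_BASES = set("ACGTN")  # assumption: URAdime may accept other symbols


def validate_primer_tsv(handle):
    """Return a list of problems found in a primer TSV; empty means none found."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for col in ("Forward", "Reverse"):
            seq = (row[col] or "").strip().upper()
            if not seq or set(seq) - VALID_BASES:
                problems.append(f"line {line_no}: {col} is not a plain DNA sequence")
        size = (row["Size"] or "").strip()
        if not size.isdigit() or int(size) <= 0:
            problems.append(f"line {line_no}: Size is not a positive integer")
    return problems


example = (
    "Name\tForward\tReverse\tSize\n"
    "Pair1\tATCGATCGATCG\tTAGCTAGCTAGC\t100\n"
    "Pair2\tGCTAGCTAGCTA\tCGATTCGATCGA\t150\n"
)
print(validate_primer_tsv(io.StringIO(example)))  # prints []
```

For a file on disk, open it with `open("primers.tsv", newline="")` and pass the handle to the same function.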

Output Files

The tool generates several CSV files with the analysis results:

  • *_summary.csv: Overall analysis summary
  • *_matched_pairs.csv: Reads with matching primer pairs
  • *_mismatched_pairs.csv: Reads with mismatched primer pairs
  • *_wrong_size_pairs.csv: Reads with correct primer pairs but wrong size
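To check that a run produced every file, the expected paths can be derived from the output prefix. This sketch assumes that the `*` in the names above stands for the value passed to `-o`; the helpers `expected_outputs` and `missing_outputs` are hypothetical and not part of URAdime.

```python
from pathlib import Path

# Suffixes taken from the list of output files above.
OUTPUT_SUFFIXES = ("summary", "matched_pairs", "mismatched_pairs", "wrong_size_pairs")


def expected_outputs(prefix):
    """Map each output kind to the path it would have for a given prefix."""
    prefix = Path(prefix)
    return {
        suffix: prefix.with_name(f"{prefix.name}_{suffix}.csv")
        for suffix in OUTPUT_SUFFIXES
    }


def missing_outputs(prefix):
    """Return the output kinds whose files do not exist on disk."""
    return [kind for kind, path in expected_outputs(prefix).items() if not path.exists()]


for kind, path in expected_outputs("results/my_analysis").items():
    print(kind, path)
```

Each existing path can then be loaded with `pandas.read_csv(path)` for further inspection.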

Features

  • BAM file analysis
  • Primer sequence identification
  • Flexible matching with Levenshtein distance
  • Comprehensive analysis reporting
  • Parallel processing support
  • Both CLI and Python API
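To illustrate what matching within a Levenshtein distance means, here is a small pure-Python sketch. It is not URAdime's implementation: the function names are hypothetical, the distance is computed by hand rather than with python-Levenshtein, and the scan compares only windows of the primer's length, which is a simplification.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def find_primer(read, primer, max_distance=2):
    """Return (start, distance) of the best primer-length window, or None."""
    best = None
    width = len(primer)
    for start in range(len(read) - width + 1):
        distance = levenshtein(read[start:start + width], primer)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (start, distance)
    return best


primer = "ATCGATCGATCG"
read = "TT" + "ATCGATGGATCG" + "AAAA"  # primer with one substitution
print(find_primer(read, primer, max_distance=2))  # prints (2, 1)
```

A larger `max_distance` tolerates more sequencing errors but also raises the chance of a spurious match, especially for short primers.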

Requirements

  • Python ≥3.7
  • pysam
  • pandas
  • biopython
  • python-Levenshtein
  • tqdm
  • numpy
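Two of the packages above are imported under a different name than the one used to install them (biopython as `Bio`, python-Levenshtein as `Levenshtein`). A quick environment check, sketched here with the hypothetical helper `check_environment`, can confirm that everything is importable without actually importing it.

```python
import importlib.util
import sys

# Install name -> import name for the requirements listed above.
IMPORT_NAMES = {
    "pysam": "pysam",
    "pandas": "pandas",
    "biopython": "Bio",
    "python-Levenshtein": "Levenshtein",
    "tqdm": "tqdm",
    "numpy": "numpy",
}


def check_environment():
    """Report whether the Python version and each dependency are available."""
    report = {"python>=3.7": sys.version_info >= (3, 7)}
    for dist_name, module_name in IMPORT_NAMES.items():
        # find_spec returns None when a top-level module cannot be found.
        report[dist_name] = importlib.util.find_spec(module_name) is not None
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'ok' if ok else 'MISSING'}")
```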

License

This project is licensed under the MIT License - see the LICENSE file for details.
