Analyze haplotypes from Illumina paired-end amplicon sequencing
CloneArmy
CloneArmy is a Python package for analyzing haplotypes from Illumina paired-end amplicon sequencing data. It processes FASTQ files, aligns reads with BWA-MEM, and identifies sequence variants.
Features
- Fast paired-end read processing using BWA-MEM
- Quality-based filtering of bases and alignments
- Haplotype identification and frequency analysis
- Rich command-line interface with progress tracking
- Comprehensive output reports
- Multi-threading support
Installation
pip install cloneArmy
Requirements
- Python ≥ 3.8
- BWA (must be installed and available in PATH)
- Samtools (must be installed and available in PATH)
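Because both external tools must be on PATH, it can help to check for them before starting a run. The following is a minimal sketch using only the standard library; the helper name `find_missing_tools` is hypothetical and not part of CloneArmy, and the executable names `bwa` and `samtools` are assumed to be the ones your installation provides.

```python
import shutil

# Executable names assumed for the two external requirements.
REQUIRED_TOOLS = ("bwa", "samtools")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```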
Usage
Command Line Interface
# Basic usage
cloneArmy /path/to/fastq/directory reference.fasta
# With all options
cloneArmy /path/to/fastq/directory reference.fasta \
    --threads 8 \
    --output results \
    --min-base-quality 20 \
    --min-mapping-quality 30
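When the command line is driven from a script, the same invocation can be assembled as an argument list. This is only a sketch: `build_command` and `run_clone_army` are hypothetical helpers, and the flags are the ones shown in the example above, nothing more.

```python
import subprocess


def build_command(fastq_dir, reference, threads=None, output=None,
                  min_base_quality=None, min_mapping_quality=None):
    """Assemble the cloneArmy argument list; options left as None are omitted."""
    cmd = ["cloneArmy", str(fastq_dir), str(reference)]
    options = {
        "--threads": threads,
        "--output": output,
        "--min-base-quality": min_base_quality,
        "--min-mapping-quality": min_mapping_quality,
    }
    for flag, value in options.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_clone_army(*args, **kwargs):
    """Run the CLI and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(*args, **kwargs), check=True)
```

Passing a list rather than a single shell string avoids quoting problems when paths contain spaces.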
Python API
from pathlib import Path
from clone_army.processor import AmpliconProcessor
# Initialize processor
processor = AmpliconProcessor(
    reference_path="reference.fasta",
    min_base_quality=20,
    min_mapping_quality=30
)
# Process a single sample
results = processor.process_sample(
    fastq_r1="sample_R1.fastq.gz",
    fastq_r2="sample_R2.fastq.gz",
    output_dir="results",
    threads=4
)
# Results are returned as a pandas DataFrame
print(results)
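The example above handles one sample. To loop over a directory, the read files first have to be paired. The sketch below assumes files follow the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming used in the example; `pair_fastq_files` and `process_directory` are hypothetical helpers, not part of the package, and `processor` is expected to be an `AmpliconProcessor` created as shown above. Whether CloneArmy separates per-sample outputs inside a shared `output_dir` is not stated here, so check that before reusing one directory.

```python
from pathlib import Path


def pair_fastq_files(fastq_dir):
    """Map sample name -> (R1 path, R2 path) for files named <sample>_R1.fastq.gz."""
    pairs = {}
    for r1 in sorted(Path(fastq_dir).glob("*_R1.fastq.gz")):
        sample = r1.name[: -len("_R1.fastq.gz")]
        r2 = r1.with_name(sample + "_R2.fastq.gz")
        if r2.exists():  # skip samples whose mate file is missing
            pairs[sample] = (r1, r2)
    return pairs


def process_directory(processor, fastq_dir, output_dir="results", threads=4):
    """Call processor.process_sample once per paired sample and collect the results."""
    results = {}
    for sample, (r1, r2) in pair_fastq_files(fastq_dir).items():
        results[sample] = processor.process_sample(
            fastq_r1=str(r1),
            fastq_r2=str(r2),
            output_dir=output_dir,
            threads=threads,
        )
    return results
```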
Output
For each sample, CloneArmy generates:
- A sorted BAM file with alignments
- A CSV file containing haplotype information:
  - Sequence
  - Read count
  - Frequency
  - Number of mutations
- Console output
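To make the haplotype columns listed above concrete, here is a toy illustration of how such a table could be derived from already-aligned, equal-length sequences. It is not CloneArmy's implementation: the function name `haplotype_table` and the dictionary keys are hypothetical, and mutations are counted simply as mismatches against the reference, with no handling of indels or base qualities.

```python
from collections import Counter


def haplotype_table(sequences, reference):
    """Summarize equal-length sequences as rows of sequence, count, frequency, mutations."""
    counts = Counter(sequences)
    total = sum(counts.values())
    rows = []
    # most_common() yields haplotypes from the highest read count to the lowest.
    for sequence, count in counts.most_common():
        if len(sequence) != len(reference):
            raise ValueError("sequence and reference lengths differ")
        # Number of positions where the haplotype differs from the reference.
        mutations = sum(1 for a, b in zip(sequence, reference) if a != b)
        rows.append({
            "sequence": sequence,
            "count": count,
            "frequency": count / total,
            "mutations": mutations,
        })
    return rows


if __name__ == "__main__":
    for row in haplotype_table(["ACGT", "ACGT", "ACGA", "ACGT"], "ACGT"):
        print(row)
```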