FASTQ-to-analysis-ready-CRAM Workflow Executor for Human Genome Sequencing
ftarc
Installation
$ pip install -U ftarc
Dependent commands:
- pigz
- pbzip2
- bgzip
- tabix
- samtools (and plot-bamstats)
- gnuplot
- java
- gatk
- cutadapt
- fastqc
- trim_galore
- bwa or bwa-mem2
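Before running the pipeline, it can help to confirm that these commands are on `PATH`. The following is a minimal sketch, not part of ftarc: `check_commands` is a hypothetical helper, and the command names are taken from the list above.

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and
# returns non-zero if at least one command is absent.
check_commands() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Commands that are always required.
check_commands pigz pbzip2 bgzip tabix samtools plot-bamstats gnuplot java gatk \
  cutadapt fastqc trim_galore || true

# Either bwa or bwa-mem2 is sufficient, so bwa-mem2 is only reported
# as missing when bwa is absent as well.
check_commands bwa >/dev/null || check_commands bwa-mem2 || true
```

An empty output means every required command was found.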
Docker image
Pull the image from Docker Hub.
$ docker image pull dceoy/ftarc
Usage
Create analysis-ready CRAM files from FASTQ files
| input files | output files |
|---|---|
| read1/read2 FASTQ (Illumina) | analysis-ready CRAM |
- Download hg38 resource data.

  $ ftarc download --dest-dir=/path/to/download/dir

- Write input file paths and configurations into ftarc.yml.

  $ ftarc init
  $ vi ftarc.yml  # => edit

  Example of ftarc.yml:

    ---
    reference_name: hs38DH
    adapter_removal: true
    metrics_collectors:
      fastqc: true
      picard: true
      samtools: true
    resources:
      reference_fa: /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa
      known_sites_vcf:
        - /path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz
        - /path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
        - /path/to/Homo_sapiens_assembly38.known_indels.vcf.gz
    runs:
      - fq:
          - /path/to/sample01.WGS.R1.fq.gz
          - /path/to/sample01.WGS.R2.fq.gz
      - fq:
          - /path/to/sample02.WGS.R1.fq.gz
          - /path/to/sample02.WGS.R2.fq.gz
      - fq:
          - /path/to/sample03.WGS.R1.fq.gz
          - /path/to/sample03.WGS.R2.fq.gz
        read_group:
          ID: FLOWCELL-1
          PU: UNIT-1
          SM: sample03
          PL: ILLUMINA
          LB: LIBRARY-1

- Create analysis-ready CRAM files from FASTQ files.

  $ ftarc pipeline --yml=ftarc.yml --workers=2
Standard workflow:
- Trim adapters
  - trim_galore
- Map reads to a human reference genome
  - bwa mem (or bwa-mem2 mem)
- Mark duplicates
  - gatk MarkDuplicates
  - gatk SetNmMdAndUqTags
- Apply BQSR (Base Quality Score Recalibration)
  - gatk BaseRecalibrator
  - gatk ApplyBQSR
- Remove duplicates
  - samtools view
- Validate output CRAM files
  - gatk ValidateSamFile
- Collect QC metrics
  - fastqc
  - samtools
  - gatk
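The `runs` section of ftarc.yml lists one `fq` pair per sample, so for many samples it may be easier to generate it than to type it. The following is a minimal sketch, assuming the FASTQ files are named `<sample>.R1.fq.gz` and `<sample>.R2.fq.gz`; `print_runs` is a hypothetical helper and not part of ftarc.

```shell
# Print a "runs:" block for ftarc.yml from paired FASTQ files in a directory.
# Assumption: mates are named <sample>.R1.fq.gz and <sample>.R2.fq.gz.
print_runs() {
  fq_dir=$1
  echo "runs:"
  for r1 in "$fq_dir"/*.R1.fq.gz; do
    # Skip the unexpanded pattern when the directory has no R1 files.
    [ -e "$r1" ] || continue
    r2="${r1%.R1.fq.gz}.R2.fq.gz"
    if [ ! -e "$r2" ]; then
      # Unpaired R1 files are reported on stderr and left out.
      echo "warning: no R2 mate for $r1" >&2
      continue
    fi
    printf '  - fq:\n      - %s\n      - %s\n' "$r1" "$r2"
  done
}

# Usage (placeholder path):
#   print_runs /path/to/fastq/dir
```

The printed block can be pasted over the `runs:` section of the file created by `ftarc init`. Optional `read_group` entries still have to be added by hand.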
Preprocessing and QC-check
- Validate BAM or CRAM files using Picard

  $ ftarc validate /path/to/genome.fa /path/to/aligned.cram

- Collect metrics from FASTQ files using FastQC

  $ ftarc fastqc read1.fq.gz read2.fq.gz

- Collect metrics from BAM or CRAM files using Picard and Samtools

  $ ftarc samqc /path/to/genome.fa /path/to/aligned.cram

- Apply BQSR to BAM or CRAM files using GATK

  $ ftarc bqsr \
      --known-sites-vcf=/path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz \
      --known-sites-vcf=/path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
      --known-sites-vcf=/path/to/Homo_sapiens_assembly38.known_indels.vcf.gz \
      /path/to/genome.fa /path/to/markdup.cram

- Remove duplicates in marked BAM or CRAM files

  $ ftarc dedup /path/to/genome.fa /path/to/markdup.cram
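Duplicate removal with `samtools view` relies on the SAM flag that the marking step sets: bit 0x400 (1024) means "PCR or optical duplicate". The sketch below shows what such a filter could look like. It is an illustration under that assumption, with placeholder file names, and may differ from what `ftarc dedup` runs internally; `dedup_command` and `is_duplicate` are hypothetical helpers.

```shell
# SAM flag bit for "PCR or optical duplicate" (0x400).
DUP_FLAG=1024

# Print a samtools command that writes a CRAM without duplicate-flagged reads.
# -C: CRAM output, -T: reference FASTA, -F: exclude reads with these flag bits.
dedup_command() {
  ref_fa=$1
  in_cram=$2
  out_cram=$3
  echo "samtools view -C -T $ref_fa -F $DUP_FLAG -o $out_cram $in_cram"
}

# Succeed if a numeric SAM flag has the duplicate bit set.
is_duplicate() {
  [ $(( $1 & $DUP_FLAG )) -ne 0 ]
}

# Show the command for placeholder paths without executing it.
dedup_command /path/to/genome.fa /path/to/markdup.cram /path/to/dedup.cram
```

Printing the command instead of executing it keeps the sketch usable on a machine without samtools.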
Run ftarc --help for more information.