A fastqc pipeline from the Sequana project.
Project description
This is the fastqc pipeline from the Sequana project.
- Overview:
Runs fastqc and multiqc on a set of sequencing data to produce quality control reports
- Input:
A set of FastQ files (paired or single-end) compressed or not
- Output:
An HTML file summary.html (individual fastqc reports, multi-sample report)
- Status:
production
- Wiki:
- Documentation:
This README file, the Wiki from the github repository (link above) and https://sequana.readthedocs.io
- Citation:
Cokelaer et al. (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI https://doi.org/10.21105/joss.00352
Installation
You must install Sequana first (use --upgrade to get the latest version installed):
pip install sequana --upgrade
Then, just install this package:
pip install sequana_fastqc --upgrade
Usage
This command will scan all files ending in .fastq.gz found in the local directory, create a directory called fastqc/ where a snakemake pipeline is launched automatically. Depending on the number of files and their sizes, the process may be long:
sequana_fastqc --run
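Before launching the pipeline, it can help to preview which files the default scan would pick up. The following is a minimal sketch, not the pipeline's own code: the helper name find_fastq_files is hypothetical, and it only assumes the behaviour described above (files ending in .fastq.gz in the local directory).

```python
from pathlib import Path


def find_fastq_files(directory=".", pattern="*.fastq.gz"):
    """Return the sorted names of files in `directory` matching `pattern`.

    Hypothetical helper for illustration; the pipeline performs its own scan.
    Only the top level of `directory` is searched (no recursion).
    """
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())
```

Calling find_fastq_files() in the directory that holds your data lists the candidate input files; passing another glob pattern mimics restricting the run to a subset of the files.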
To learn more about the options (e.g., using a different pattern to restrict the execution to a subset of the input files, changing the output/working directory, etc.):
sequana_fastqc --help
sequana_fastqc --input-directory DATAPATH
This creates a directory fastqc. You just need to execute the pipeline:
cd fastqc
sh fastqc.sh # for a local run
This launches a snakemake pipeline. If you are familiar with snakemake, you can retrieve the fastqc.rules and config.yaml files and then execute the pipeline yourself with specific parameters:
snakemake -s fastqc.rules --cores 4 --stats stats.txt
Alternatively, use the Sequanix interface.
Please see the Wiki for more examples and features.
Tutorial
You can retrieve test data from sequana_fastqc (https://github.com/sequana/fastqc) or type:
wget https://raw.githubusercontent.com/sequana/fastqc/master/sequana_pipelines/fastqc/data/data_R1_001.fastq.gz
wget https://raw.githubusercontent.com/sequana/fastqc/master/sequana_pipelines/fastqc/data/data_R2_001.fastq.gz
Then, prepare the pipeline:
sequana_fastqc --input-directory .
cd fastqc
sh fastqc.sh
# once done, remove temporary files (snakemake and others)
make clean
Just open the HTML entry called summary.html. A multiqc report is also available. Both contain the expected fastqc summary images.
Please see the Wiki for more examples and features.
Requirements
This pipeline requires the following executables:
fastqc
falco (optional)
sequana (Python: pip install sequana)
For Linux users, we provide a singularity image available through damona:
pip install damona
damona install fastqc
# and add the ~/.config/damona/bin path to your binary PATH
Details
This pipeline runs fastqc in parallel on the input fastq files (paired or not) and then executes multiqc. A brief sequana summary report is also produced.
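To make the "fastqc in parallel" idea concrete, here is a minimal sketch using a thread pool. It is not how the pipeline is implemented (the pipeline relies on snakemake for scheduling). The function names and the runner hook are hypothetical; the only external assumption is that the fastqc executable is on your PATH and accepts -o for the output directory.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fastqc_command(fastq_file, outdir="fastqc_out"):
    """Build the command line for one FastQ file (-o sets the output directory)."""
    return ["fastqc", "-o", str(outdir), str(fastq_file)]


def run_fastqc_in_parallel(fastq_files, outdir="fastqc_out", workers=4, runner=None):
    """Run one fastqc process per file, with at most `workers` running at once.

    `runner` is a hypothetical hook that receives a command list. It defaults
    to subprocess.run(..., check=True) and can be replaced for a dry run.
    """
    if runner is None:
        def runner(cmd):
            return subprocess.run(cmd, check=True)

    # fastqc expects the output directory to exist already
    Path(outdir).mkdir(parents=True, exist_ok=True)

    commands = [fastqc_command(f, outdir) for f in fastq_files]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so that a failing command raises here
        list(pool.map(runner, commands))
    return commands
```

Passing runner=print gives a dry run that only shows the commands. Once the real run has finished, multiqc can be pointed at the output directory to aggregate the individual reports.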
You may use falco instead of fastqc. This is experimental but seems to work for Illumina/FastQ files.
This pipeline has been tested on several hundred MiSeq, NextSeq, MiniSeq, ISeq100 and PacBio runs.
It computes an md5sum of your data, copes with empty samples, and produces ready-to-use HTML reports.
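If you want to cross-check a checksum yourself, the sketch below computes an MD5 digest with Python's standard hashlib module. It is an illustration only, not the pipeline's own implementation, and the function name md5sum is chosen here for convenience. The file is read in chunks so that large FastQ files do not need to fit in memory, and an empty file is handled like any other.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in blocks of `chunk_size` bytes (1 MiB by default).
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result should match the output of the md5sum command-line tool on the same file.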
Rules and configuration details
Here is the latest documented configuration file to be used with the pipeline. Each rule used in the pipeline may have a section in the configuration file.
Changelog
Version | Description |
---|---|
1.3.0 | |
1.2.0 | |
1.1.0 | |
1.0.1 | |
1.0.0 | |
0.9.15 | |
0.9.14 | |
0.9.13 | |
0.9.12 | |
0.9.11 | |
0.9.10 | |
0.9.9 | |
0.9.8 | |
0.9.7 | |
0.9.6 | add the readtag option |
Contribute & Code of Conduct
To contribute to this project, please take a look at the Contributing Guidelines first. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.