A fastqc pipeline from sequana project.
Project description
This is the fastqc pipeline from the Sequana project.
Overview: | Runs fastqc and multiqc on a set of sequencing data to produce quality control reports |
---|---|
Input: | A set of FastQ files (paired or single-end) compressed or not |
Output: | An HTML file, summary.html (individual fastqc reports, multi-sample report) |
Status: | production |
Wiki: | https://github.com/sequana/fastqc/wiki |
Documentation: | This README file, the Wiki from the github repository (link above) and https://sequana.readthedocs.io |
Citation: | Cokelaer et al, (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI https://doi.org/10.21105/joss.00352 |
Installation
You must install Sequana first (use --upgrade to get the latest version installed):
pip install sequana --upgrade
Then, just install this package:
pip install sequana_fastqc --upgrade
Usage
This command scans all files ending in .fastq.gz in the current directory and creates a directory called fastqc/, where a Snakemake pipeline is launched automatically. Depending on the number of files and their sizes, the run may take a long time:
sequana_fastqc --run
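To make the scanning step concrete, here is a minimal Python sketch of how files ending in .fastq.gz could be collected from a directory and grouped into samples. It is not the code used by sequana_fastqc: the function names find_fastq_files and group_by_sample, and the _R1_/_R2_ read-tag convention used for pairing, are assumptions made for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path


def find_fastq_files(directory=".", pattern="*.fastq.gz"):
    """Return the sorted list of files in `directory` matching `pattern`."""
    return sorted(p for p in Path(directory).glob(pattern) if p.is_file())


def group_by_sample(files, readtag=r"_R[12]_"):
    """Group files by sample name (hypothetical naming convention).

    The sample name is the part of the file name that precedes the read
    tag. Files without a read tag are treated as single-end and keyed by
    the part of their name before the first dot.
    """
    samples = defaultdict(list)
    for path in files:
        match = re.search(readtag, path.name)
        if match:
            name = path.name[: match.start()]
        else:
            name = path.name.split(".", 1)[0]
        samples[name].append(path)
    return dict(samples)
```

With the tutorial files data_R1_001.fastq.gz and data_R2_001.fastq.gz, both files would end up under the single sample name data, which is how a paired-end sample would be recognised in this sketch.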
To learn more about the options (e.g., adding a different pattern to restrict the execution to a subset of the input files, or changing the output/working directory), use:
sequana_pipelines_fastqc --help
sequana_pipelines_fastqc --input-directory DATAPATH
This creates a directory called fastqc. You then just need to execute the pipeline:
cd fastqc
sh fastqc.sh # for a local run
This launches a Snakemake pipeline. If you are familiar with Snakemake, you can retrieve the fastqc.rules and config.yaml files and then execute the pipeline yourself with specific parameters:
snakemake -s fastqc.rules --cores 4 --stats stats.txt
Alternatively, use the Sequanix interface.
Please see the Wiki for more examples and features.
Tutorial
You can retrieve test data from sequana_fastqc (https://github.com/sequana/fastqc) or type:
wget https://raw.githubusercontent.com/sequana/fastqc/master/sequana_pipelines/fastqc/data/data_R1_001.fastq.gz
wget https://raw.githubusercontent.com/sequana/fastqc/master/sequana_pipelines/fastqc/data/data_R2_001.fastq.gz
Then, prepare and run the pipeline:
sequana_fastqc --input-directory .
cd fastqc
sh fastqc.sh
# once done, remove temporary files (snakemake and others)
make clean
Just open the HTML entry called summary.html. A multiqc report is also available. You will get the expected images in these reports.
Requirements
This pipeline requires the following executable(s):
- fastqc
- falco (optional)
- sequana (Python: pip install sequana)
For Linux users, we provide a singularity image available through damona:
pip install damona
damona install fastqc
# and add the ~/.config/damona/bin path to your binary PATH
Details
This pipeline runs fastqc in parallel on the input FastQ files (paired or not) and then executes multiqc. A brief Sequana summary report is also produced. You may use falco instead of fastqc. This is experimental but seems to work for Illumina/FastQ files.
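The "fastqc on each file in parallel, then multiqc once" flow described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Snakemake rules shipped with the pipeline: the helper names, the single shared output directory, and the worker count are hypothetical. Only the -o output-directory options of fastqc and multiqc are taken from those tools' command-line interfaces.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fastqc_command(fastq, outdir):
    """Build the command line for one fastqc run (-o sets the output directory)."""
    return ["fastqc", "-o", str(outdir), str(fastq)]


def multiqc_command(indir, outdir):
    """Build the multiqc command that aggregates the reports found in `indir`."""
    return ["multiqc", str(indir), "-o", str(outdir)]


def run_all(fastq_files, outdir="fastqc_out", workers=4, runner=subprocess.run):
    """Run fastqc on every file in parallel, then run multiqc once.

    `runner` is injectable (hypothetical design choice) so that the flow
    can be dry-run without fastqc or multiqc being installed.
    """
    # fastqc expects its output directory to exist already.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    commands = [fastqc_command(f, outdir) for f in fastq_files]
    # Threads are enough here: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(runner, commands))
    # multiqc starts only after every fastqc job has finished.
    runner(multiqc_command(outdir, outdir))
    return commands
```

Passing runner=print gives a dry run that only prints the commands. In the real pipeline, Snakemake handles this scheduling and the dependency between the two steps.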
This pipeline has been tested on several hundred MiSeq, NextSeq, MiniSeq, iSeq 100, and PacBio runs.
It produces an md5sum of your data, copes with empty samples, and produces ready-to-use HTML reports.
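As an illustration of the checksum and empty-sample handling mentioned above, the following sketch computes the MD5 digest of a file in chunks and tests whether a FastQ file (gzip-compressed or not) contains any data. The function names are hypothetical, and the pipeline's own implementation may differ.

```python
import gzip
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory usage flat for large FastQ files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_empty_fastq(path):
    """Return True if a FastQ file (gzipped or plain) holds no data."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        # A single decompressed byte is enough to know the file is not empty.
        return handle.read(1) == b""
```

Note that a gzip file with an empty payload still has a non-zero size on disk, which is why the check reads the decompressed content rather than relying on the file size.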
Rules and configuration details
Here is the latest documented configuration file to be used with the pipeline. Each rule used in the pipeline may have a section in the configuration file.
Changelog
Version | Description |
---|---|
1.4.2 | |
1.4.1 | |
1.4.0 | |
1.3.0 | |
1.2.0 | |
1.1.0 | |
1.0.1 | |
1.0.0 | |
0.9.15 | |
0.9.14 | |
0.9.13 | |
0.9.12 | |
0.9.11 | |
0.9.10 | |
0.9.9 | |
0.9.8 | |
0.9.7 | |
0.9.6 | add the readtag option |
Contribute & Code of Conduct
To contribute to this project, please take a look at the Contributing Guidelines first. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.