
Fast sequencing quality metrics


Sequali

Sequence quality metrics for FASTQ and uBAM files.

Features:

  • Low memory footprint, small install size and fast execution times.

  • Informative graphs that allow for judging the quality of a sequence at a quick glance.

  • Overrepresentation analysis using 21 bp sequence fragments. Overrepresented sequences are checked against the NCBI UniVec database.

  • Duplication rate estimation using a fingerprint subsampling technique that is also used in filesystem duplication estimation.

  • Checks for 6 Illumina adapter sequences and 17 nanopore adapter sequences.

  • Per-tile quality plots for Illumina reads.

  • Channel and other plots for nanopore reads.

  • FASTQ and unaligned BAM are supported. See “Supported formats”.
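To make the fragment-based overrepresentation idea above more concrete, here is a minimal Python sketch. It is an illustration only and not Sequali's actual implementation: the function names, the choice to cut reads into non-overlapping fragments, and the `threshold` value are all assumptions made for this example. Only the 21 bp fragment length comes from the feature list.

```python
from collections import Counter

FRAGMENT_LENGTH = 21  # fragment size named in the feature list


def count_fragments(reads, fragment_length=FRAGMENT_LENGTH):
    """Count non-overlapping fragments of `fragment_length` bp across reads.

    Assumption for this sketch: trailing bases that do not fill a whole
    fragment are ignored.
    """
    counts = Counter()
    for read in reads:
        last_start = len(read) - fragment_length
        for start in range(0, last_start + 1, fragment_length):
            counts[read[start:start + fragment_length]] += 1
    return counts


def overrepresented(counts, total_reads, threshold=0.001):
    """Return fragments whose occurrences per read exceed `threshold`.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    return {
        fragment: n / total_reads
        for fragment, n in counts.items()
        if n / total_reads > threshold
    }


if __name__ == "__main__":
    reads = ["A" * 21 + "C" * 21, "A" * 21 + "G" * 10]
    counts = count_fragments(reads)
    print(overrepresented(counts, total_reads=len(reads), threshold=0.6))
```

Counting fixed-size fragments instead of whole reads keeps the number of distinct keys manageable for long reads, which is one plausible reason to work with short fragments.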

Example reports:

For more information check the documentation.

Supported formats

  • FASTQ. Only the Sanger variation, with a phred offset of 33 and an error rate calculated as 10 ^ (-phred/10), is supported. All modern sequencers use this format.

    • For sequences called by Illumina base callers, an additional plot with the per-tile quality will be provided.

    • For sequences called by Guppy, additional plots for nanopore-specific data will be provided.

  • Unaligned BAM (uBAM). Any alignment flags are currently ignored.

    • For uBAM data as delivered by Dorado, additional nanopore plots will be provided.
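The Sanger encoding described above can be sketched in a few lines of Python. This is a minimal illustration of the offset-33 decoding and the 10 ^ (-phred/10) error rate, not code taken from Sequali; the function names are made up for this example. The `mean_phred` helper averages in error-rate space instead of averaging the phred values directly, which is one common way to avoid overstating the quality of a read.

```python
import math


def decode_qualities(quality_string, offset=33):
    """Convert a FASTQ quality string to phred scores (Sanger, offset 33)."""
    return [ord(char) - offset for char in quality_string]


def phred_to_error_rate(phred):
    """Error probability for a phred score: 10 ^ (-phred / 10)."""
    return 10 ** (-phred / 10)


def mean_phred(phreds):
    """Average phred scores by averaging their error rates first."""
    mean_error = sum(phred_to_error_rate(q) for q in phreds) / len(phreds)
    return -10 * math.log10(mean_error)


if __name__ == "__main__":
    phreds = decode_qualities("I5+")
    print(phreds)
    print([phred_to_error_rate(q) for q in phreds])
    print(mean_phred(phreds))
```

Because the phred scale is logarithmic, the error-rate average is dominated by the lowest-quality bases, so it is lower than the plain arithmetic mean of the scores whenever the scores differ.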

Installation

Installation via pip is available with:

pip install sequali

Sequali is also distributed via bioconda. It can be installed with:

conda install -c conda-forge -c bioconda sequali

Quickstart

sequali path/to/my.fastq.gz

This will create an HTML report my.fastq.gz.html and a JSON file my.fastq.gz.json in the current working directory.
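For scripted workflows, the output naming described above can be reproduced with a small Python helper. This is a sketch under stated assumptions: `expected_outputs` and `load_report` are hypothetical helper names, not part of Sequali, and nothing is assumed about the JSON layout beyond it being valid JSON.

```python
import json
from pathlib import Path


def expected_outputs(input_path, outdir=None):
    """Return the (html, json) paths that a run on `input_path` should create.

    The file names are the input file name with `.html` and `.json` appended.
    `outdir` defaults to the current working directory.
    """
    outdir = Path(outdir) if outdir is not None else Path.cwd()
    name = Path(input_path).name
    return outdir / f"{name}.html", outdir / f"{name}.json"


def load_report(json_path):
    """Load the JSON output into Python objects."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    html_path, json_path = expected_outputs("path/to/my.fastq.gz")
    print(html_path.name, json_path.name)
    if json_path.exists():
        # Inspect the top-level structure without assuming specific keys.
        print(list(load_report(json_path)))
```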

For all command line options, check out the usage documentation.

For more extensive information about the module options check the documentation on the module options.

Acknowledgements

  • FastQC for its excellent selection of relevant metrics. For this reason these metrics are also gathered by Sequali.

  • The matplotlib team for their excellent work on colormaps. Their work was an inspiration for how to present the data and their RdBu colormap is used to represent quality score data. Check their writings on colormaps for a good introduction.

  • Wouter de Coster for his excellent post on how to correctly average phred scores.

  • Marcel Martin for providing very extensive feedback.

License

This project is licensed under the GNU Affero General Public License v3, mainly to prevent commercial parties from using it without notifying users that they can run it themselves. If you want to include code from Sequali in your open source project, but its license is not compatible with the AGPL, please contact me and we can discuss a separate license.


