License: MIT

EZQC: Easy Quality Control for FastQ Files

Introduction

EZQC is a streamlined, terminal-based alternative to FastQC. Instead of generating a separate report file for each input, EZQC prints each analysis result, the reasoning behind it, and suggested next steps directly in the terminal, which makes it easier to assess the quality of multiple files quickly. EZQC also generates a figure for each analysis, providing a visual aid for spotting potential issues that need further examination.

EZQC is capable of performing the following analyses:

  1. Per base sequence quality
  2. Per sequence quality scores
  3. Per base sequence content
  4. Per sequence GC content
  5. Per base N content
  6. Sequence Length Distribution
  7. Overrepresented sequences
  8. Adapter Content

Why Choose EZQC over FastQC?

  • Fast Result Readout When Batch Processing: With EZQC, there's no need to click into each HTML report like you would with FastQC.
  • Automatic Interpretation of Analysis Results: The results are color-coded and provided in plain English, complete with suggestions. This makes it easier for users to interpret the results quickly.
  • Generate Detailed Figures for Advanced Users: For those who want a more in-depth analysis, EZQC is capable of generating detailed figures that aid in understanding the quality of your FastQ files.

Quick Start Guide

  1. Install EZQC by following the Installation section.
  2. Run the tool on a toy example using the command ezqc tests/SRR020192.fastq (a FASTQ file from IGSR).
  3. The results will be displayed in the terminal, and the figures and CSV tables will be saved to a directory named ezqc_output in your current working directory. Note that this file was intentionally chosen to fail multiple QC tests.
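After a run, it can help to see what was written to the output directory. The following is a minimal sketch that groups the files in ezqc_output by extension. The helper name group_outputs is hypothetical and is not part of EZQC. The exact file names and figure formats EZQC writes are not documented here, so the sketch makes no assumption about them.

```python
from collections import defaultdict
from pathlib import Path


def group_outputs(output_dir="ezqc_output"):
    """Map each file extension to the sorted file names found under output_dir.

    Hypothetical helper for illustration; it only inspects the directory
    and does not depend on how EZQC names its figures or tables.
    """
    grouped = defaultdict(list)
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file():
            grouped[path.suffix.lower()].append(path.name)
    return dict(grouped)


# Only report if a previous run has created the default output directory.
if Path("ezqc_output").is_dir():
    for suffix, names in group_outputs().items():
        print(f"{suffix or '(no extension)'}: {len(names)} file(s)")
```

The CSV tables will then appear under the ".csv" key, and the figures under whichever image extension they were saved with.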

Installation

You can install EZQC using pip:

pip install ezqc

Alternatively, you can install the latest version of EZQC from source using the provided setup.py script by following these steps:

  1. Clone the repository:
git clone https://github.com/skysky2333/ezqc
  2. Navigate to the EZQC directory:
cd ezqc
  3. Install the package:
pip install .

Or

python setup.py install

EZQC requires Python 3.x and depends on the following packages, which will be installed automatically during setup:

  • numpy
  • matplotlib
  • pandas
  • scipy
  • Bio

Usage

After installation, you can use EZQC from the command line as follows:

ezqc <fastq file(s)>

Replace <fastq file(s)> with the path(s) to your FastQ files. If you want to analyze multiple files, separate the file paths with spaces:

ezqc file1.fastq file2.fastq file3.fastq

Use -o or --output to set the output directory. Use -h or --help to see help messages.
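When EZQC is called from a script instead of typed by hand, the command line described above can be assembled programmatically. The sketch below is illustrative only: build_ezqc_command is a hypothetical helper, qc_results is a made-up directory name, and it assumes that -o takes a single directory argument and may follow the input files.

```python
import shlex


def build_ezqc_command(fastq_files, output_dir=None):
    """Return the argument list for one ezqc run over several FASTQ files.

    Hypothetical helper; it assumes -o accepts one directory and can be
    placed after the positional file arguments.
    """
    if not fastq_files:
        raise ValueError("at least one FASTQ file is required")
    command = ["ezqc", *map(str, fastq_files)]
    if output_dir is not None:
        command += ["-o", str(output_dir)]
    return command


command = build_ezqc_command(["file1.fastq", "file2.fastq"], output_dir="qc_results")
print(shlex.join(command))  # shell-quoted form of the command, for logging

# To actually run it (requires ezqc to be installed and on PATH):
# import subprocess
# subprocess.run(command, check=True)
```

Building the command as a list avoids shell quoting problems when file paths contain spaces.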

Analysis Methods

Here's a brief description of the analyses performed by EZQC:

  1. Per Base Sequence Quality: Checks the quality of each base call in a sequence read.
  2. Per Sequence Quality Scores: Provides a histogram of quality scores over all sequences.
  3. Per Base Sequence Content: Analyzes the proportion of each base (A, T, G, C) at each position across all sequences.
  4. Per Sequence GC Content: Calculates the GC content in each sequence.
  5. Per Base N Content: Reports the proportion of unknown (N) base calls at each position across all sequences.
  6. Sequence Length Distribution: Provides a histogram showing the distribution of sequence lengths.
  7. Overrepresented Sequences: Identifies any sequences that occur more often than expected.
  8. Adapter Content: Detects the presence of adapter sequences in the reads.
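To make some of the metrics above concrete, here is a rough sketch of how a few of them could be computed with the Python standard library alone. This is not EZQC's implementation. The function names are hypothetical, the parser assumes well-formed four-line FASTQ records, and quality characters are assumed to use the Phred+33 encoding.

```python
from collections import Counter
from io import StringIO


def read_fastq(handle):
    """Yield (sequence, quality_string) pairs from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        yield seq, qual


def summarize(records, phred_offset=33):
    """Compute a few FastQC-style metrics (illustrative, not EZQC's own code)."""
    records = list(records)
    max_len = max(len(seq) for seq, _ in records)
    qual_sum = [0] * max_len  # summed Phred score at each read position
    n_count = [0] * max_len   # number of N calls at each read position
    depth = [0] * max_len     # number of reads covering each position
    gc_per_read = []
    for seq, qual in records:
        for i, (base, q) in enumerate(zip(seq, qual)):
            qual_sum[i] += ord(q) - phred_offset  # assumes Phred+33 encoding
            n_count[i] += base.upper() == "N"
            depth[i] += 1
        gc = sum(base in "GCgc" for base in seq)
        gc_per_read.append(100.0 * gc / len(seq) if seq else 0.0)
    return {
        "mean_quality_per_base": [s / d for s, d in zip(qual_sum, depth)],
        "n_fraction_per_base": [n / d for n, d in zip(n_count, depth)],
        "gc_percent_per_read": gc_per_read,
        "length_distribution": Counter(len(seq) for seq, _ in records),
        "sequence_counts": Counter(seq for seq, _ in records),
    }


# Three made-up reads, used only to exercise the functions.
toy = StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGN\n+\nII5!\n@r3\nGGC\n+\n###\n")
metrics = summarize(read_fastq(toy))
print(metrics["n_fraction_per_base"])
print(metrics["gc_percent_per_read"])
```

The per-position lists correspond to the "per base" analyses, the per-read list to the "per sequence" analyses, and the sequence counter is the starting point for spotting overrepresented sequences. Deciding what counts as "more often than expected" needs a threshold, which this sketch does not choose.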

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details on how to contribute.
