
Microbial Quality Control Dashboard


uQCme - Microbial Quality Control Tool


A comprehensive quality control (QC) tool for microbial sequencing data that provides both command-line processing and interactive web-based visualization capabilities.

Overview

uQCme consists of two main components:

  1. CLI Tool (uqcme): A command-line interface that processes microbial sequencing QC data against configurable quality control rules. It determines QC outcomes (PASS/FAIL/WARNING) based on species-specific criteria.
  2. Web Dashboard (uqcme-dashboard): An interactive Streamlit application for visualizing and exploring the QC results generated by the CLI tool.

Features

Core Functionality

  • Species-specific QC rules: Support for numerous microbial species with tailored quality control criteria defined in QC_rules.tsv.
  • Configurable QC tests: Define custom QC outcomes with priority-based rule conditions.
  • Flexible rule engine: Regex-based validation with threshold checks for various QC metrics.
  • Robust validation: Data validation using Pandera schemas and Pydantic configuration management.
  • Interactive dashboard: Web-based visualization with filtering, sorting, and detailed sample exploration.
  • Comprehensive logging: Detailed logging system with both file and console output.
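
As an illustration of what a regex-based check might look like, here is a minimal sketch in Python. It is not taken from the uQCme source: the function name `regex_check` and the example pattern are hypothetical, and the real rule engine may match values differently (for example, partial rather than full matches).

```python
import re


def regex_check(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the pattern (assumed semantics)."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical rule: the detected species must be one of two Campylobacter species.
pattern = r"Campylobacter (coli|jejuni)"

print(regex_check("Campylobacter jejuni", pattern))
print(regex_check("Escherichia coli", pattern))
```

Using `re.fullmatch` avoids accepting values that merely contain the expected name as a substring, which seems the safer default for species identification.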

Supported Species

The tool handles specific QC rules for the following species (as defined in the default QC_rules.tsv):

  • Acinetobacter baumannii
  • Campylobacter coli
  • Campylobacter jejuni
  • Enterobacter spp.
  • Enterococcus faecium
  • Escherichia coli
  • Haemophilus influenzae
  • Helicobacter pylori
  • Klebsiella pneumoniae
  • Mycoplasma genitalium
  • Neisseria gonorrhoeae
  • Pseudomonas aeruginosa
  • Salmonella enterica
  • Shigella flexneri
  • Shigella sonnei
  • Staphylococcus aureus
  • Streptococcus pneumoniae

Note: Rules whose species is "all" apply to any species not explicitly listed, or serve as general baseline checks.

QC Metrics

  • Assembly statistics (N50, contigs, genome size)
  • CheckM completeness and contamination
  • Species identification validation
  • Coverage depth analysis
  • Quality score assessments

Installation

From PyPI

Core only (shared logic):

pip install uqcme

CLI only:

pip install uqcme[cli]

Dashboard/App only:

pip install uqcme[app]

Full installation (CLI + Web Dashboard):

pip install uqcme[all]

From Source

git clone https://github.com/ssi-dk/uQCme.git
cd uQCme

# Core only
pip install .

# Full installation
pip install ".[all]"

Usage

1. CLI Tool (uqcme)

The CLI tool reads your sequencing run data and applies the QC rules defined in your configuration.

Basic Usage (using defaults):

uqcme

This uses the bundled default configuration and looks for input files in the current directory, as specified in that configuration.

Override Data Source: You can process a specific data file or API endpoint without creating a full config file:

# Process a local file
uqcme --file path/to/my_run_data.tsv

# Process data from an API
uqcme --api-call "https://api.example.com/runs/123"

Custom Configuration: For full control over rules, tests, and mappings, provide a custom configuration file:

uqcme --config my_config.yaml
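
If you want to drive the CLI from a Python script, a thin wrapper around `subprocess` is one option. The sketch below uses only the three flags shown above (`--file`, `--api-call`, `--config`). The helper names `build_uqcme_command` and `run_uqcme` are hypothetical, and whether the flags can be combined in a single call is an assumption that has not been checked against the tool.

```python
import subprocess
from typing import List, Optional


def build_uqcme_command(
    file: Optional[str] = None,
    api_call: Optional[str] = None,
    config: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a uqcme invocation from the documented flags."""
    cmd = ["uqcme"]
    if file is not None:
        cmd += ["--file", file]
    if api_call is not None:
        cmd += ["--api-call", api_call]
    if config is not None:
        cmd += ["--config", config]
    return cmd


def run_uqcme(**kwargs: Optional[str]) -> int:
    """Run uqcme and return its exit code (requires uqcme to be on PATH)."""
    return subprocess.run(build_uqcme_command(**kwargs), check=False).returncode


# Only print the command here; call run_uqcme(...) to actually execute it.
print(build_uqcme_command(file="path/to/my_run_data.tsv"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with paths and URLs.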

What it does:

  1. Loads run data (from file, API, or defaults).
  2. Loads QC rules (QC_rules.tsv) and QC tests (QC_tests.tsv).
    • Note: If local rule files are missing, it falls back to bundled defaults.
  3. Evaluates each sample against the rules for its species.
  4. Determines the final QC outcome (e.g., PASS, FAIL).
  5. Outputs a new TSV file containing the original data plus the QC results (e.g., uQCme_run_data.tsv).
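
The steps above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the uQCme implementation: the rule table uses a simplified, hypothetical `minimum` column, the output column names `failed_rules` and `qc_outcome` are made up for the example, and rules with species "all" are assumed to apply to every sample in addition to any species-specific rules.

```python
import csv
import io

# Simplified, hypothetical rule table (not the real QC_rules.tsv layout).
RULES_TSV = (
    "rule_id\tspecies\tcolumn_name\tminimum\n"
    "R1\tall\tcompleteness\t90\n"
    "R2\tEscherichia coli\tn50\t50000\n"
)

# Hypothetical run data.
RUN_TSV = (
    "sample\tspecies\tn50\tcompleteness\n"
    "S1\tEscherichia coli\t60000\t95\n"
    "S2\tEscherichia coli\t20000\t85\n"
    "S3\tListeria monocytogenes\t10000\t99\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def applicable_rules(rules, species):
    """Assumption: 'all' rules apply to every sample, plus species-specific rules."""
    return [r for r in rules if r["species"] in ("all", species)]


def evaluate(sample, rules):
    """Return the sample with the failed rule IDs and a PASS/FAIL outcome appended."""
    failed = [
        r["rule_id"]
        for r in applicable_rules(rules, sample["species"])
        if float(sample[r["column_name"]]) < float(r["minimum"])
    ]
    result = dict(sample)
    result["failed_rules"] = ",".join(failed)
    result["qc_outcome"] = "FAIL" if failed else "PASS"
    return result


rules = read_tsv(RULES_TSV)
results = [evaluate(sample, rules) for sample in read_tsv(RUN_TSV)]

# Write the original columns plus the QC columns as TSV (to memory here).
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=list(results[0].keys()), delimiter="\t", lineterminator="\n"
)
writer.writeheader()
writer.writerows(results)
print(out.getvalue())
```

In this toy data, sample S3 belongs to a species with no dedicated rules, so only the "all" rule is evaluated for it.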

2. Web Dashboard (uqcme-dashboard)

The dashboard visualizes the results generated by the CLI tool.

uqcme-dashboard

With a configuration file:

uqcme-dashboard --config config.yaml

What it does:

  1. Launches a local web server (Streamlit).
  2. Loads the processed data (from file or API) as specified in config.yaml.
  3. Provides an interactive interface to:
    • View summary statistics.
    • Filter samples by QC outcome, species, or specific metrics.
    • Inspect individual sample details and failed rules.
    • Visualize metric distributions.

Configuration

The tool is driven by a config.yaml file. This file defines:

  • Input paths: Locations of your data, mapping file, and QC rules/tests.
  • Output paths: Where to save the results and logs.
  • App settings: Dashboard configuration (server port, UI preferences).

Key Input Files:

See the input/example/ directory for template files.

Output Files

1. QC Results (uQCme_run_data.tsv)

Enhanced run data with QC outcomes:

  • Original sample data
  • Failed rules per sample
  • Assigned QC outcome with priority
  • Color coding for visualization
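
Because the results file is plain TSV, it can be summarized outside the dashboard with a short script. The sketch below counts samples per QC outcome. The column name `qc_outcome` is an assumption (check the header of your own uQCme_run_data.tsv), and the example rows are invented.

```python
import csv
import io
from collections import Counter


def summarize_outcomes(tsv_text, outcome_column="qc_outcome"):
    """Count rows per value of the (assumed) outcome column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[outcome_column] for row in reader)


# Invented example rows; real output contains the original run data columns too.
example = (
    "sample\tspecies\tqc_outcome\n"
    "S1\tEscherichia coli\tPASS\n"
    "S2\tEscherichia coli\tFAIL\n"
    "S3\tSalmonella enterica\tPASS\n"
)

counts = summarize_outcomes(example)
for outcome, n in counts.most_common():
    print(outcome, n)
```

To use it on a real file, read the file contents with `open(path, encoding="utf-8").read()` and pass the text to `summarize_outcomes`.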

2. Rule Warnings (uQCme_rule_warnings.tsv)

Detailed log of rule evaluation issues:

  • Skipped rules and reasons
  • Data validation warnings
  • Processing statistics

Dashboard Features

Data Overview

  • Interactive data table with filtering and sorting
  • Priority-based color coding of QC outcomes
  • Summary statistics and sample counts

Sample Details

  • Detailed view of individual sample QC results
  • Failed rules and thresholds
  • Interactive metric exploration

Visualization

  • Plotly-based interactive charts
  • Customizable metric comparisons
  • Species-specific analysis views

Filtering and Search

  • Dynamic filtering by QC outcome, species, and metrics
  • Search functionality across all data columns
  • Export capabilities for filtered datasets

Advanced Usage

Custom QC Rules

Create custom QC rules by editing QC_rules.tsv:

rule_id	species	qc_tool	qc_metric	validation_type	threshold	column_name
CUSTOM1	Escherichia coli	Assembly	N50	threshold	>=50000	n50
CUSTOM2	all	CheckM	Completeness	threshold	>=90	completeness
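
To make the threshold column concrete, here is a minimal sketch of how a string such as `>=50000` could be parsed and applied to a metric value. Only the `>=` form appears in the examples above. The other operators, the function name `passes_threshold`, and the parsing approach are assumptions for illustration and may not match the actual rule engine.

```python
import operator
import re

# Comparison operators this sketch understands (only ">=" is shown in the examples).
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# An operator followed by a number, e.g. ">=50000" or "<= 5.5".
THRESHOLD_RE = re.compile(r"\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def passes_threshold(value, threshold):
    """Return True when value satisfies a threshold string such as '>=90'."""
    match = THRESHOLD_RE.fullmatch(threshold)
    if match is None:
        raise ValueError(f"Unsupported threshold: {threshold!r}")
    op, limit = match.groups()
    return OPERATORS[op](float(value), float(limit))


print(passes_threshold(62000, ">=50000"))
print(passes_threshold(85.5, ">=90"))
```

Raising an error on an unparseable threshold, rather than silently passing the sample, keeps malformed rules visible.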

Species-Specific Tests

Define new QC tests in QC_tests.tsv:

outcome_id	outcome_name	description	priority	rule_conditions	action_required
FAIL_CUSTOM	Fail - Custom QC	Custom quality control failed	3	failed_rules_contain:CUSTOM1,CUSTOM2	reject
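
The sketch below shows one way a `failed_rules_contain:` condition and the priority column could interact. It rests on two assumptions that the text above does not confirm: a test matches when any of its listed rules has failed, and a smaller priority number takes precedence (swap `min` for `max` if the tool orders priorities the other way). The `QCTest` class, the helper names, and the `WARN_EXAMPLE` test are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional

PREFIX = "failed_rules_contain:"


@dataclass(frozen=True)
class QCTest:
    outcome_id: str
    priority: int
    rule_ids: FrozenSet[str]


def parse_condition(condition: str) -> FrozenSet[str]:
    """Extract rule IDs from 'failed_rules_contain:ID1,ID2'."""
    if not condition.startswith(PREFIX):
        raise ValueError(f"Unsupported condition: {condition!r}")
    parts = condition[len(PREFIX):].split(",")
    return frozenset(p.strip() for p in parts if p.strip())


def select_outcome(failed_rules: Iterable[str], tests: List[QCTest]) -> Optional[QCTest]:
    """Pick the matching test with the smallest priority number, or None."""
    failed = set(failed_rules)
    matching = [t for t in tests if t.rule_ids & failed]  # assumed: any listed rule failed
    return min(matching, key=lambda t: t.priority) if matching else None


tests = [
    QCTest("FAIL_CUSTOM", 3, parse_condition("failed_rules_contain:CUSTOM1,CUSTOM2")),
    QCTest("WARN_EXAMPLE", 5, parse_condition("failed_rules_contain:CUSTOM2")),  # hypothetical
]

chosen = select_outcome(["CUSTOM2"], tests)
print(chosen.outcome_id if chosen else "no matching test")
```

When no test matches, the function returns `None`, leaving it to the caller to decide what the default outcome should be.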

Development

This project uses pixi for dependency management and development workflow.

To set up the development environment:

pixi install

To run tests:

pixi run pytest

Project Structure

uQCme/
├── src/
│   └── uQCme/
│       ├── __init__.py
│       ├── app.py          # Streamlit web dashboard
│       ├── cli.py          # CLI processing tool
│       ├── plot.py         # Plotting utilities
│       └── utils.py        # Shared utilities
├── config.yaml             # Configuration file
├── input/
│   └── example/            # Example input files
├── output/                 # Generated results
├── log/                    # Application logs
└── tests/                  # Unit tests

Running Tests

python -m pytest tests/

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

Troubleshooting

Common Issues

  1. Missing input files: Ensure all required input files exist and paths in config.yaml are correct
  2. Rule validation errors: Check that QC rules reference valid column names in your data
  3. Dashboard not loading: Verify Streamlit installation and port availability

Logging

Check the log file (./log/log.tsv) for detailed processing information:

tail -f ./log/log.tsv

Citation

If you use uQCme in your research, please cite:

uQCme: A Comprehensive Quality Control Tool for Microbial Sequencing Data
SSI-DK, 2025

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For questions, issues, or feature requests:

  • Create an issue on GitHub
  • Contact: Kim Ng (kimn@ssi.dk)

Changelog

See CHANGELOG.md for version history and updates.
