Biovalid


Quick validation of bioinformatics files


Pipeline information

  • Author(s): Gino Raaijmakers
  • Organization: Rijksinstituut voor Volksgezondheid en Milieu (RIVM)
  • Department: Infectieziekteonderzoek, Diagnostiek en Laboratorium Surveillance (IDS), Informatiebeheer (IBR)
  • Start date: 23-07-2025

About this project

Biovalid is a lightweight Python library and CLI tool for fast, robust validation of bioinformatics files such as BAM, FASTA, and FASTQ. It checks file integrity, headers, and format compliance, helping users catch common issues before downstream analysis.
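
To make "format compliance" concrete, here is a minimal sketch of the kind of structural check a FASTQ validator might perform. It is an illustration only, not Biovalid's implementation: the function name `check_fastq_text` is hypothetical, and it assumes simple four-line records with no wrapped sequences.

```python
def check_fastq_text(text: str) -> list[str]:
    """Return a list of problems found in four-line FASTQ records (empty if none)."""
    lines = text.splitlines()
    problems = []
    if not lines:
        return ["file is empty"]
    if len(lines) % 4 != 0:
        problems.append(f"line count {len(lines)} is not a multiple of 4")
    # Walk over complete records only; a trailing partial record was reported above.
    for start in range(0, len(lines) - len(lines) % 4, 4):
        header, seq, sep, qual = lines[start:start + 4]
        rec = start // 4 + 1
        if not header.startswith("@"):
            problems.append(f"record {rec}: header does not start with '@'")
        if not sep.startswith("+"):
            problems.append(f"record {rec}: separator line does not start with '+'")
        if len(seq) != len(qual):
            problems.append(f"record {rec}: sequence and quality lengths differ")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a file at once.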


Features

  • File Format Support: Validate BAM, FASTA, and FASTQ files.
  • Lightweight: No third-party dependencies.
  • Dual Usage: Use as a CLI tool or import as a Python library.
  • Customizable: Enable verbose logging, save logs to a file, or return boolean results.
  • Extensible: Designed to support additional file formats in the future.

Installation

Conda

conda create -n biovalid "python>=3.10"
conda activate biovalid
pip install biovalid

Pip

pip install biovalid

Parameters & Usage

Command-line help

python3 -m biovalid --help

Required parameters

  • -i, --input Path to the file or directory to validate

Optional parameters

  • -v, --verbose Enable verbose logging
  • -l, --log_file Path to a log file
  • -b, --bool_mode Return True/False instead of raising exceptions

Example command

python3 -m biovalid -i /path/to/file.bam
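
When the CLI is called from another script, the documented flags can be assembled programmatically. The sketch below uses only the options listed above. The helper name `build_biovalid_command` and the example paths are hypothetical, and it assumes that `-v` and `-b` are plain switches that take no value.

```python
import sys


def build_biovalid_command(input_path, verbose=False, log_file=None, bool_mode=False):
    """Build an argument list for `python -m biovalid` from the documented flags."""
    # sys.executable keeps the call inside the current Python environment.
    cmd = [sys.executable, "-m", "biovalid", "-i", str(input_path)]
    if verbose:
        cmd.append("-v")
    if log_file is not None:
        cmd += ["-l", str(log_file)]
    if bool_mode:
        cmd.append("-b")
    return cmd
```

The resulting list can be passed to `subprocess.run`, for example `subprocess.run(build_biovalid_command("/path/to/file.bam", verbose=True))`. Passing a list rather than a single string avoids shell quoting problems with unusual file names.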

Library usage

from biovalid import BioValidator

validator = BioValidator(file_paths="/path/to/file.bam", verbose=True)
validator.validate_files()
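
Outside bool mode, validation problems surface as exceptions. A minimal sketch of converting that into a result value in calling code follows. The exception classes Biovalid raises are not documented in this README, so the sketch catches `Exception` broadly; the helper name `run_validation` is hypothetical, and it accepts any object that has a `validate_files()` method.

```python
def run_validation(validator):
    """Call validator.validate_files() and return (ok, message) instead of raising."""
    try:
        validator.validate_files()
    except Exception as exc:  # assumption: the concrete exception types are unknown
        return False, f"{type(exc).__name__}: {exc}"
    return True, "all files passed validation"
```

With Biovalid installed, this could be used as `ok, message = run_validation(BioValidator(file_paths="/path/to/file.bam"))`. Errors raised while constructing the validator would not be caught by this helper.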

Output

  • Logging: Validation results and errors are printed to the console and optionally saved to a log file.
  • Return values: In bool mode, returns True if all files are valid, False otherwise.
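
The console-plus-optional-file behavior described above maps naturally onto Python's standard `logging` module. The following is a sketch of that pattern, not Biovalid's actual logging setup; the function name `make_logger` and the default logger name are hypothetical.

```python
import logging


def make_logger(name="validation_demo", verbose=False, log_file=None):
    """Return a logger that writes to the console and, optionally, to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    logger.handlers.clear()  # avoid duplicate output if called more than once
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = [logging.StreamHandler()]
    if log_file is not None:
        handlers.append(logging.FileHandler(log_file))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

In this sketch, `verbose=True` lowers the threshold to `DEBUG`, and passing `log_file` adds a second handler so the same messages reach both destinations.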

Issues


Future ideas

  • Add support for more file formats (e.g., VCF, GFF).
  • Improve error messages and reporting.
  • Make the tool more user-friendly for external users.

License

This project is licensed under the AGPL-3.0 license. See the LICENSE file for details.


Contact


Acknowledgements

Thanks to the IDS and IBR teams at RIVM for their support and feedback.

File details

Details for the file biovalid-0.2.0.tar.gz.

File metadata

  • Download URL: biovalid-0.2.0.tar.gz
  • Size: 1.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for biovalid-0.2.0.tar.gz

  • SHA256: 904960f39d9dae2feb14a61f488f4f218589ede641dc347edcb82b36008993a8
  • MD5: 936db0d66908393fe5069b730901d8ec
  • BLAKE2b-256: de449e377dc327a0f1d03da593f9f81854872b2d4cbc20092fc70327eb5d0684

Provenance

The following attestation bundles were made for biovalid-0.2.0.tar.gz:

Publisher: release-please.yml on RIVM-bioinformatics/biovalid

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file biovalid-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: biovalid-0.2.0-py3-none-any.whl
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for biovalid-0.2.0-py3-none-any.whl

  • SHA256: 369b8c7014cb349c25aef43bb2f77f429c3fd6e49146900119358c2f1e54776d
  • MD5: 596b79e6b71a25aaa2fba3f3b92149f0
  • BLAKE2b-256: d0d4d6ffd4f549a0c7e866e883a041c126115885bb535dafabc8cc0c22d9cba4

Provenance

The following attestation bundles were made for biovalid-0.2.0-py3-none-any.whl:

Publisher: release-please.yml on RIVM-bioinformatics/biovalid

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
