
Quick validation of bioinformatics files


Biovalid



Pipeline information

  • Author(s): Gino Raaijmakers
  • Organization: Rijksinstituut voor Volksgezondheid en Milieu (RIVM)
  • Department: Infektieziekteonderzoek, Diagnostiek en Laboratorium Surveillance (IDS), Informatiebeheer (IBR)
  • Start date: 23-07-2025

About this project

Biovalid is a lightweight Python library and CLI tool for fast, robust validation of bioinformatics files such as BAM, FASTA, and FASTQ. It checks file integrity, headers, and format compliance, helping users catch common issues before downstream analysis.
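To illustrate what a header check can look like, here is a minimal sketch that guesses a file's format from its first bytes. It is not Biovalid's implementation; the function name `sniff_format` is hypothetical. It relies only on well-known format facts: BAM files are gzip-compatible (BGZF) containers whose decompressed stream starts with `BAM\1`, FASTA records start with `>`, and FASTQ records start with `@`.

```python
import gzip


def sniff_format(path):
    """Guess 'bam', 'fasta' or 'fastq' from the leading bytes, else None.

    Hypothetical helper for illustration only; gzip-compressed FASTA/FASTQ
    files are deliberately not handled and yield None.
    """
    with open(path, "rb") as handle:
        head = handle.read(2)

    if head == b"\x1f\x8b":  # gzip magic number, also used by BGZF/BAM
        try:
            with gzip.open(path, "rb") as handle:
                # The decompressed BAM stream begins with the magic "BAM\1".
                return "bam" if handle.read(4) == b"BAM\x01" else None
        except (OSError, EOFError):
            # Truncated or corrupt gzip data.
            return None

    if head.startswith(b">"):
        return "fasta"
    if head.startswith(b"@"):
        return "fastq"
    return None
```

A check like this only looks at the header; a full validator also has to inspect the records that follow.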


Features

  • File Format Support: Validate BAM, FASTA, and FASTQ files.
  • Lightweight: No dependencies.
  • Dual Usage: Use as a CLI tool or import as a Python library.
  • Customizable: Enable verbose logging, save logs to a file, or return boolean results.
  • Extensible: Designed to support additional file formats in the future.

Installation

Conda

conda create -n biovalid "python>=3.10"
conda activate biovalid
pip install biovalid

Pip

pip install biovalid
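To confirm that the installation succeeded, you can query the installed version from Python. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Biovalid.

```python
from importlib import metadata


def installed_version(package="biovalid"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"biovalid {version}" if version else "biovalid is not installed")
```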

Parameters & Usage

Command-line help

python3 -m biovalid --help

Required parameters

  • -i, --input: Path to the file or directory to validate

Optional parameters

  • -v, --verbose: Enable verbose logging
  • -l, --log_file: Path to a log file
  • -b, --bool_mode: Return True/False instead of raising exceptions

Example command

python3 -m biovalid -i /path/to/file.bam
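If you want to call the CLI from a Python script, the command can be assembled from the options listed above. The sketch below is one possible wrapper, not part of Biovalid: the function names are hypothetical, and it assumes that `--verbose` and `--bool_mode` are plain switches while `--log_file` takes a path. How the CLI reports results through its exit code or output is not documented here, so the sketch simply returns the completed process for inspection.

```python
import subprocess
import sys


def build_biovalid_command(input_path, verbose=False, log_file=None, bool_mode=False):
    """Build the argument list for `python -m biovalid` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "biovalid", "--input", str(input_path)]
    if verbose:
        cmd.append("--verbose")  # assumed to be a switch without a value
    if log_file is not None:
        cmd += ["--log_file", str(log_file)]
    if bool_mode:
        cmd.append("--bool_mode")  # assumed to be a switch without a value
    return cmd


def run_biovalid(input_path, **options):
    """Run the CLI; requires biovalid to be installed in this environment."""
    return subprocess.run(
        build_biovalid_command(input_path, **options),
        capture_output=True,
        text=True,
    )
```

For example, `run_biovalid("/path/to/file.bam", verbose=True, log_file="validation.log")` runs the example command with verbose logging and a log file.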

Library usage

from biovalid import BioValidator

validator = BioValidator(file_paths="/path/to/file.bam", verbose=True)
validator.validate_files()
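Outside bool mode, validation failures are raised as exceptions, so a script that checks several files may want to collect them rather than stop at the first one. The following is a minimal sketch under stated assumptions: `collect_failures` is a hypothetical helper, and because the exception types Biovalid raises are not listed here, it catches `Exception` broadly. The validator is passed in as a factory so that only the `BioValidator(file_paths=...)` constructor and `validate_files()` shown above are relied on.

```python
def collect_failures(paths, make_validator):
    """Validate each path separately and map failing paths to error messages.

    `make_validator` is any callable that takes a path and returns an object
    with a `validate_files()` method (hypothetical helper for illustration).
    """
    failures = {}
    for path in paths:
        try:
            make_validator(path).validate_files()
        except Exception as exc:  # exact exception types are an unknown here
            failures[str(path)] = f"{type(exc).__name__}: {exc}"
    return failures


# Usage with biovalid installed:
# from biovalid import BioValidator
# failures = collect_failures(
#     ["/path/to/a.bam", "/path/to/b.fastq"],
#     lambda p: BioValidator(file_paths=p),
# )
```

An empty dictionary means every file passed validation.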

Output

  • Logging: Validation results and errors are printed to the console and optionally saved to a log file.
  • Return values: In bool mode, returns True if all files are valid, False otherwise.

Issues


Future ideas

  • Add support for more file formats (e.g., VCF, GFF).
  • Improve error messages and reporting.
  • Make the tool more user-friendly for external users.

License

This project is licensed under the AGPL-3.0 license. See the LICENSE file for details.


Contact


Acknowledgements

Thanks to the IDS and IBR teams at RIVM for their support and feedback.

