Quick validation of bioinformatics files
Biovalid
Pipeline information
- Author(s): Gino Raaijmakers
- Organization: Rijksinstituut voor Volksgezondheid en Milieu (RIVM)
- Department: Infektieziekteonderzoek, Diagnostiek en Laboratorium Surveillance (IDS), Informatiebeheer (IBR)
- Start date: 23-07-2025
About this project
Biovalid is a lightweight Python library and CLI tool for fast, robust validation of bioinformatics files such as BAM, FASTA, and FASTQ. It checks file integrity, headers, and format compliance, helping users catch common issues before downstream analysis.
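Biovalid's internal checks are not shown in this description. As a rough illustration of what a first-pass format check can look like, the sketch below guesses a format from a file's leading bytes. It relies only on general format facts: BAM files are BGZF (gzip-compatible) containers whose decompressed stream starts with the bytes BAM\1, FASTA records start with ">", and FASTQ records start with "@". The function name sniff_format is hypothetical and is not part of Biovalid.

```python
import gzip


def sniff_format(path):
    """Guess a file's format from its first bytes (illustrative, not Biovalid's code)."""
    with open(path, "rb") as handle:
        head = handle.read(2)
    if head == b"\x1f\x8b":
        # gzip/BGZF container: a BAM payload starts with the magic b"BAM\x01".
        with gzip.open(path, "rb") as handle:
            return "BAM" if handle.read(4) == b"BAM\x01" else "gzip (not BAM)"
    if head[:1] == b">":
        return "FASTA"
    if head[:1] == b"@":
        # Note: SAM headers also start with "@", so this is only a first guess.
        return "FASTQ"
    return "unknown"
```

A check like this only looks at the first few bytes; it says nothing about whether the rest of the file is well formed.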
Features
- File Format Support: Validate BAM, FASTA, and FASTQ files.
- Lightweight: No dependencies.
- Dual Usage: Use as a CLI tool or import as a Python library.
- Customizable: Enable verbose logging, save logs to a file, or return boolean results.
- Extensible: Designed to support additional file formats in the future.
Installation
Conda
conda create -n biovalid "python>=3.10"
conda activate biovalid
pip install biovalid
Pip
pip install biovalid
Parameters & Usage
Command-line help
python3 -m biovalid --help
Required parameters
| Option | Description |
|---|---|
| -i, --input | Path to the file or directory to validate |
Optional parameters
| Option | Description |
|---|---|
| -v, --verbose | Enable verbose logging |
| -l, --log_file | Path to a log file |
| -b, --bool_mode | Return True/False instead of raising exceptions |
Example command
python3 -m biovalid -i /path/to/file.bam
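When the CLI is called from another script, it can help to assemble the argument list in one place. The sketch below builds the command from the options documented above. The helper name build_command is hypothetical, and it assumes that -v and -b are plain switches while -l takes a path, which matches their descriptions but is not confirmed here.

```python
import sys


def build_command(input_path, verbose=False, log_file=None, bool_mode=False):
    """Build an argument list for `python -m biovalid` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "biovalid", "-i", str(input_path)]
    if verbose:
        cmd.append("-v")  # assumed to be a switch without a value
    if log_file is not None:
        cmd += ["-l", str(log_file)]  # assumed to take the log file path as its value
    if bool_mode:
        cmd.append("-b")  # assumed to be a switch without a value
    return cmd
```

The resulting list can be passed to subprocess.run. How the CLI reports failure through its exit code is not documented in this description, so check that before relying on it.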
Library usage
from biovalid import BioValidator
validator = BioValidator(file_paths="/path/to/file.bam", verbose=True)
validator.validate_files()
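Outside bool mode, validation problems are raised as exceptions, so a batch run stops at the first bad file unless the caller catches them. The sketch below collects failures per file instead. The names validate_each and biovalid_check are hypothetical; only the BioValidator call itself comes from the usage shown above, and the exception types Biovalid raises are not documented here, hence the broad except.

```python
def validate_each(paths, validate_one):
    """Run validate_one on every path and collect error messages instead of stopping."""
    failures = {}
    for path in paths:
        try:
            validate_one(path)
        except Exception as exc:  # exception types are an unknown here, so catch broadly
            failures[path] = str(exc)
    return failures


def biovalid_check(path):
    """Validate a single file with the documented BioValidator call."""
    # Deferred import so validate_each can be used and tested without Biovalid installed.
    from biovalid import BioValidator

    BioValidator(file_paths=path, verbose=True).validate_files()
```

For example, validate_each(["/path/to/file.bam"], biovalid_check) returns an empty dict when every file passes.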
Output
- Logging: Validation results and errors are printed to the console and optionally saved to a log file.
- Return values: In bool mode, returns
Trueif all files are valid,Falseotherwise.
Issues
Future ideas
- Add support for more file formats (e.g., VCF, GFF).
- Improve error messages and reporting.
- Make the tool more user-friendly for external users.
License
This project is licensed under the AGPL-3.0 license. See the LICENSE file for details.
Contact
- Contact person: Gino Raaijmakers
- Email: gino.raaijmakers@rivm.nl
Acknowledgements
Thanks to the IDS and IBR teams at RIVM for their support and feedback.
File details
Details for the file biovalid-0.2.0.tar.gz.
File metadata
- Download URL: biovalid-0.2.0.tar.gz
- Upload date:
- Size: 1.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 904960f39d9dae2feb14a61f488f4f218589ede641dc347edcb82b36008993a8 |
| MD5 | 936db0d66908393fe5069b730901d8ec |
| BLAKE2b-256 | de449e377dc327a0f1d03da593f9f81854872b2d4cbc20092fc70327eb5d0684 |
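A downloaded archive can be checked against the SHA256 digest listed above for biovalid-0.2.0.tar.gz. The following is a minimal sketch using only the Python standard library; the function names sha256_of and matches_published are illustrative, and the local file path is whatever location the file was saved to.

```python
import hashlib

# SHA256 digest published above for biovalid-0.2.0.tar.gz
EXPECTED_SHA256 = "904960f39d9dae2feb14a61f488f4f218589ede641dc347edcb82b36008993a8"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Calling matches_published("biovalid-0.2.0.tar.gz") on the downloaded file should return True if the download is intact.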
Provenance
The following attestation bundles were made for biovalid-0.2.0.tar.gz:
- Publisher: release-please.yml on RIVM-bioinformatics/biovalid
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: biovalid-0.2.0.tar.gz
- Subject digest: 904960f39d9dae2feb14a61f488f4f218589ede641dc347edcb82b36008993a8
- Sigstore transparency entry: 439255511
- Sigstore integration time:
- Permalink: RIVM-bioinformatics/biovalid@6989034d1dd0030ddedac11dd75cc16ccf819686
- Branch / Tag: refs/heads/main
- Owner: https://github.com/RIVM-bioinformatics
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@6989034d1dd0030ddedac11dd75cc16ccf819686
- Trigger Event: push
File details
Details for the file biovalid-0.2.0-py3-none-any.whl.
File metadata
- Download URL: biovalid-0.2.0-py3-none-any.whl
- Upload date:
- Size: 12.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 369b8c7014cb349c25aef43bb2f77f429c3fd6e49146900119358c2f1e54776d |
| MD5 | 596b79e6b71a25aaa2fba3f3b92149f0 |
| BLAKE2b-256 | d0d4d6ffd4f549a0c7e866e883a041c126115885bb535dafabc8cc0c22d9cba4 |
Provenance
The following attestation bundles were made for biovalid-0.2.0-py3-none-any.whl:
- Publisher: release-please.yml on RIVM-bioinformatics/biovalid
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: biovalid-0.2.0-py3-none-any.whl
- Subject digest: 369b8c7014cb349c25aef43bb2f77f429c3fd6e49146900119358c2f1e54776d
- Sigstore transparency entry: 439255521
- Sigstore integration time:
- Permalink: RIVM-bioinformatics/biovalid@6989034d1dd0030ddedac11dd75cc16ccf819686
- Branch / Tag: refs/heads/main
- Owner: https://github.com/RIVM-bioinformatics
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@6989034d1dd0030ddedac11dd75cc16ccf819686
- Trigger Event: push