Zero-dependency biomedical scoring: reproducibility audit, domain-aware data quality, and ML model readiness gate — one pip install, three checks

bioscore

Biomedical scoring toolkit — reproducibility, data quality, and model readiness metrics for computational biology.

Why this exists

Many computational biology teams have no automated quality gates. A researcher finishes a notebook and shares it, and nobody can reproduce it. A data scientist trains a model, and it silently fails on edge cases. A team deploys to production without a bias audit.

bioscore closes these gaps with three one-command checks that plug into any workflow.

Install

pip install bioscore

Requires Python 3.9+. No external dependencies for core functions.

Quick Start

from bioscore import reproducibility, data_quality, model_readiness

# 1. Check if your notebook is reproducible
reproducibility("analysis.ipynb")
# → {"score": 0.65, "issues": ["missing seed", "no version pinning"], "level": "partial"}

# 2. Assess dataset quality before training
data_quality("dataset.csv", domain="oncology")
# → {"completeness": 0.8, "consistency": 0.9, "overall": 0.85}

# 3. Verify model is ready for production
model_readiness("model.pkl")
# → {"score": 0.72, "ready": False, "gaps": ["no validation split", "no bias audit"]}

Target Audience & Daily Use

🧬 Computational Biology Researcher

Their morning: Opens Jupyter, runs yesterday's analysis on new data. Shares notebook with labmates. Submits paper.

The problem: Six months later, nobody — including themselves — can reproduce the results. Random seeds weren't set. Package versions weren't pinned. The data source was a colleague's Dropbox link that's now dead.

How bioscore helps:

from bioscore import reproducibility

result = reproducibility("my_analysis.ipynb")
if result["level"] != "full":
    print("Fix before sharing:", result["issues"])

They run this before sharing any notebook. It catches missing seeds, unpinned versions, and undocumented data sources. The level field (full / partial / minimal) gives a quick summary; the check above treats anything short of full as a failure.
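Teams that want a looser bar than full can compare levels explicitly. The sketch below is a hypothetical helper, not part of bioscore: `meets_level` and `LEVEL_ORDER` are names introduced here, and the ordering minimal < partial < full is an assumption based on the level names. It runs on a sample result copied from the Quick Start output, so it does not need bioscore installed.

```python
# Hypothetical helper, not part of bioscore: compares the reported level
# against a minimum, assuming the ordering minimal < partial < full.
LEVEL_ORDER = {"minimal": 0, "partial": 1, "full": 2}


def meets_level(result: dict, required: str = "full") -> bool:
    """Return True when result["level"] is at least `required`."""
    return LEVEL_ORDER[result["level"]] >= LEVEL_ORDER[required]


# Sample result copied from the Quick Start output.
sample = {
    "score": 0.65,
    "issues": ["missing seed", "no version pinning"],
    "level": "partial",
}

if not meets_level(sample, required="full"):
    for issue in sample["issues"]:
        print("Fix before sharing:", issue)
```

In a real notebook, `sample` would be replaced by the return value of `reproducibility("my_analysis.ipynb")`.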

Install: pip install bioscore in their notebook environment (conda, venv, or Colab).

📊 Data Scientist in Pharma/Biotech

Their morning: Pulls clinical trial data. Checks for missing values. Trains a survival model. Sends to review.

The problem: Datasets have silent gaps — 30% missing in one column, inconsistent row counts, domain-specific quality rules nobody checks automatically.

How bioscore helps:

from bioscore import data_quality

result = data_quality("clinical_data.csv", domain="oncology")
if result["overall"] < 0.7:
    print(f"Quality too low ({result['overall']}), fix before training")

They run this as the first cell in every analysis notebook. Domain-aware checks (oncology, agriculture, general) apply different quality weights, so low-quality data is flagged before training instead of silently degrading the model.
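A print statement is easy to scroll past, so a notebook may prefer to stop outright when quality is too low. The following is a minimal sketch under that assumption: `require_quality` and `DataQualityError` are hypothetical names introduced here, and only the `overall` key comes from the documented return value.

```python
class DataQualityError(ValueError):
    """Raised when a dataset scores below the required threshold."""


def require_quality(result: dict, threshold: float = 0.7) -> dict:
    """Return result unchanged, or raise if result["overall"] < threshold."""
    overall = result["overall"]
    if overall < threshold:
        raise DataQualityError(
            f"Quality too low ({overall:.2f} < {threshold:.2f}), fix before training"
        )
    return result


# Sample result copied from the Quick Start output; in a notebook this would
# be the return value of data_quality("clinical_data.csv", domain="oncology").
sample = {"completeness": 0.8, "consistency": 0.9, "overall": 0.85}
require_quality(sample)  # passes silently because 0.85 >= 0.7
```

Raising an exception halts "Run All" at the first cell, which keeps later training cells from executing on a rejected dataset.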

Install: pip install bioscore in their data science environment.

🚀 ML Engineer / MLOps

Their morning: Reviews model PR. Checks metrics. Approves deployment to staging. Monitors production.

The problem: Models reach production without validation splits, bias audits, or input schemas. Issues surface only in production — expensive and risky.

How bioscore helps:

from bioscore import model_readiness

result = model_readiness("model_v2.pkl")
if not result["ready"]:
    print("Block deployment:", result["gaps"])

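They add this to their CI/CD pipeline as a deployment gate. If ready is False, the pipeline should block; for that to happen the gate script has to exit with a non-zero status, since printing alone does not fail a CI job. Gaps like "no validation split" or "no bias audit" are surfaced as actionable items.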
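One way to wire the gate into CI is to translate the result into a process exit code. This is a sketch, not a bioscore feature: `gate_exit_code` is a hypothetical helper, and the only assumption about bioscore is the documented `ready` and `gaps` keys. It is exercised here on the Quick Start sample so it runs without a model file.

```python
import sys


def gate_exit_code(result: dict) -> int:
    """Map a model_readiness result to a process exit code (0 = deploy)."""
    if result["ready"]:
        return 0
    for gap in result["gaps"]:
        # Written to stderr so CI logs show why the gate failed.
        print(f"Block deployment: {gap}", file=sys.stderr)
    return 1


# Sample result copied from the Quick Start output.
sample = {
    "score": 0.72,
    "ready": False,
    "gaps": ["no validation split", "no bias audit"],
}
exit_code = gate_exit_code(sample)
print("exit code:", exit_code)
# In a CI script, `sample` would be model_readiness("model_v2.pkl") and the
# final statement would be sys.exit(exit_code).
```

Most CI systems treat any non-zero exit status as a failed step, so no pipeline-specific configuration should be needed beyond running the script.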

Install: Add bioscore to requirements.txt or pyproject.toml in the ML pipeline project.

API Reference

`reproducibility(source: str) -> dict`

Evaluates a notebook or script for reproducibility best practices.

Checks: random seed, package version pinning, data source documentation, output preservation, environment specification, comments, docstrings, logging.

Returns: {"score": float, "issues": list[str], "level": "full"|"partial"|"minimal"}
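To make the "random seed" check concrete, here is a rough sketch of how such a check could be written with the standard library alone. It is not bioscore's implementation: `notebook_sets_seed`, `load_notebook`, and the regular expression are assumptions made for illustration, and the pattern will also match seed calls that are commented out.

```python
import json
import re

# Illustrative pattern, not bioscore's actual rule. It matches calls such as
# random.seed(...), np.random.seed(...), torch.manual_seed(...), set_seed(...).
SEED_PATTERN = re.compile(r"\b(?:random\.seed|manual_seed|set_seed)\s*\(")


def load_notebook(path: str) -> dict:
    """Read an .ipynb file, which is plain JSON on disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def notebook_sets_seed(notebook: dict) -> bool:
    """Return True if any code cell contains a recognisable seed call."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if SEED_PATTERN.search(source):
            return True
    return False


# Two tiny in-memory notebooks, so the sketch runs without any files.
with_seed = {
    "cells": [
        {"cell_type": "code", "source": ["import random\n", "random.seed(42)\n"]}
    ]
}
without_seed = {
    "cells": [
        {"cell_type": "markdown", "source": "Remember random.seed(42)!"},
        {"cell_type": "code", "source": "print('hi')"},
    ]
}
print(notebook_sets_seed(with_seed), notebook_sets_seed(without_seed))
```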

`data_quality(source: str, domain: str = "general") -> dict`

Assesses a CSV dataset for completeness and consistency.

Domains: "general", "oncology", "agriculture" — each applies domain-specific quality weights.

Returns: {"completeness": float, "consistency": float, "overall": float}
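The README does not define how completeness is computed. One common definition is the fraction of data cells that are not missing, and the sketch below implements that definition with the standard library. The function name, the set of missing-value markers, and the sample CSV are all assumptions for illustration; bioscore's own calculation may differ.

```python
import csv
import io

# Assumed missing-value markers; bioscore's own list is not documented here.
MISSING = {"", "na", "n/a", "nan", "null"}


def completeness(csv_text: str) -> float:
    """Fraction of data cells (header excluded) that are not missing."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return 0.0
    total = filled = 0
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        # Pad short rows so absent trailing cells count as missing.
        cells = row + [""] * (len(header) - len(row))
        for cell in cells[: len(header)]:
            total += 1
            if cell.strip().lower() not in MISSING:
                filled += 1
    return filled / total if total else 0.0


# Made-up sample: 4 rows x 3 columns, with 2 missing cells.
sample = "patient_id,age,stage\n1,54,II\n2,,III\n3,61,NA\n4,47,I\n"
print(round(completeness(sample), 3))
```

With 10 of 12 cells filled, the sample scores 10/12, or about 0.833.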

`model_readiness(source: str) -> dict`

Evaluates a pickled ML model artifact for production readiness.

Checks: validation split, bias audit, performance metrics, version tag, input schema, error handling, documentation, test coverage.

Returns: {"score": float, "ready": bool, "gaps": list[str]}
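Because all three functions return plain dicts, callers may find it useful to write the documented shapes down as types. The `TypedDict` classes below mirror the Returns lines in this reference; the class names and the `missing_keys` helper are hypothetical and are not exported by bioscore.

```python
from typing import List, Literal, TypedDict


class ReproducibilityResult(TypedDict):
    score: float
    issues: List[str]
    level: Literal["full", "partial", "minimal"]


class DataQualityResult(TypedDict):
    completeness: float
    consistency: float
    overall: float


class ModelReadinessResult(TypedDict):
    score: float
    ready: bool
    gaps: List[str]


def missing_keys(result: dict, shape: type) -> set:
    """Return documented keys that are absent from a result dict."""
    return set(shape.__annotations__) - set(result)


# Sample result copied from the Quick Start output.
sample = {"completeness": 0.8, "consistency": 0.9, "overall": 0.85}
print(missing_keys(sample, DataQualityResult))
```

A static type checker can then flag typos such as `result["gap"]`, while `missing_keys` offers a cheap runtime check if a future release changes the return shape.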

Innovation

bioscore is the first lightweight, zero-dependency Python toolkit that unifies three critical pre-deployment checks for computational biology:

Reproducibility scoring — not just linting, but a weighted score with actionable issues
Domain-aware data quality — oncology and agriculture have different quality standards than general data
Model readiness gate — a binary pass/fail with specific gaps, designed for CI/CD integration

No other package combines all three. Most teams cobble together custom scripts. bioscore reduces this to pip install bioscore and one function call per check.

License

MIT © K-RnD Lab

Release history

0.2.0 (this version): May 12, 2026

0.1.0: May 10, 2026

Download files

Source Distribution

bioscore-0.2.0.tar.gz (6.0 kB)

Uploaded May 12, 2026 Source

Built Distribution

bioscore-0.2.0-py3-none-any.whl (7.2 kB)

Uploaded May 12, 2026 Python 3

File details

Details for the file bioscore-0.2.0.tar.gz.

File metadata

Download URL: bioscore-0.2.0.tar.gz
Upload date: May 12, 2026
Size: 6.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for bioscore-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`f70a7c002bf0eb06172804e2379e148eb95716fc47169990c4adad4b5388bf24`
MD5	`a399dc2b80be84eb709317c99cc6c1ad`
BLAKE2b-256	`17f1b3ee4220482fec160099ece49476498b61290e90b1f22c8dba729f5927dd`

File details

Details for the file bioscore-0.2.0-py3-none-any.whl.

File metadata

Download URL: bioscore-0.2.0-py3-none-any.whl
Upload date: May 12, 2026
Size: 7.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for bioscore-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c88bb55e9c6aba4d4dfad9833b9bdc97d7dc5447d705fad1859de9b8b9159b06`
MD5	`1bc388cb80ec289f902a88f5fc5b97a7`
BLAKE2b-256	`1a8d2f27735f429885f2bb7e5a06e67724a9c9cffb8ac24324514f9931f6559a`
