Transform clinical dataframes into publication-ready, beautifully styled tables for medical journals and manuscripts.

clinipub

A lightweight Python toolkit that turns clinical pandas.DataFrame datasets into publication-ready Table 1 summaries and missing-data audits.

clinipub automates clinical descriptive analytics, standard test selection, and publication-quality output for medical research manuscripts, registries, and regulatory reporting.

Why this package exists

Baseline tables and missingness audits are essential in clinical research, yet they are often still assembled by hand, which is slow and hard to reproduce. clinipub provides a reproducible, data-first workflow for clinical scientists and medical writers.

Key features

  • MissingDataAuditor: compute missingness, generate a styled HTML report, and run Little's MCAR test
  • ClinicalDataAuditor: classify variables as categorical or continuous and test continuous variables for normality
  • BivariateTestSelector: select the appropriate clinical test automatically for continuous and categorical comparisons
  • TableOneAssembler: assemble a stratified Table 1 with descriptive statistics and p-values
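
The README does not say which rules BivariateTestSelector applies, so the following is only a sketch of one common convention, written against pandas and SciPy rather than clinipub itself. The function names compare_continuous and compare_categorical are hypothetical, and the choices made here (Welch's t-test vs. Mann-Whitney U, chi-square with a Fisher's exact fallback for sparse 2x2 tables) are assumptions, not the package's documented behaviour.

```python
import pandas as pd
from scipy import stats


def compare_continuous(df, col, group_col, is_normal):
    """Two-group comparison (hypothetical helper, not part of clinipub)."""
    groups = [g[col].dropna() for _, g in df.groupby(group_col)]
    if len(groups) != 2:
        raise ValueError("this sketch only handles exactly two groups")
    a, b = groups
    if is_normal:
        # Assumed rule: Welch's t-test when the variable looks normal
        result = stats.ttest_ind(a, b, equal_var=False)
        name = "Welch t-test"
    else:
        # Assumed rule: rank-based test otherwise
        result = stats.mannwhitneyu(a, b, alternative="two-sided")
        name = "Mann-Whitney U"
    return {"p_value": float(result.pvalue), "test": name}


def compare_categorical(df, col, group_col):
    """Chi-square test, falling back to Fisher's exact test for sparse 2x2 tables."""
    table = pd.crosstab(df[col], df[group_col])
    _, p, _, expected = stats.chi2_contingency(table)
    if table.shape == (2, 2) and (expected < 5).any():
        _, p = stats.fisher_exact(table)
        return {"p_value": float(p), "test": "Fisher exact"}
    return {"p_value": float(p), "test": "Chi-square"}


# Small made-up dataset, for illustration only
demo = pd.DataFrame({
    "treatment": ["A"] * 6 + ["B"] * 6,
    "age": [50, 52, 47, 61, 58, 49, 66, 70, 64, 72, 68, 75],
    "smoker_status": ["yes", "no", "no", "yes", "no", "no",
                      "yes", "yes", "no", "yes", "yes", "no"],
})
age_result = compare_continuous(demo, "age", "treatment", is_normal=True)
smoker_result = compare_categorical(demo, "smoker_status", "treatment")
```

Both helpers return a dict with p_value and test keys, mirroring the return shape listed for test_continuous and test_categorical in the API overview.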

Installation

Install from PyPI:

pip install clinipub

Install from source for development:

git clone https://github.com/arsalananwar11/clinipub.git
cd clinipub
uv sync

Quick start

import pandas as pd
from clinipub import (
    MissingDataAuditor,
    ClinicalDataAuditor,
    BivariateTestSelector,
    TableOneAssembler,
)

# Load clinical data
df = pd.read_csv("baseline.csv")

# Audit missing data and run Little's MCAR test
missing_auditor = MissingDataAuditor(df)
missing_df = missing_auditor.calculate_missingness()
mcar_results = missing_auditor.run_mcar_test()

html_report = missing_auditor.to_html_report(
    audit_df=missing_df,
    thresholds={"low": 1.0, "mid": 20.0},
)

# Detect variable types and assess normality
auditor = ClinicalDataAuditor(df)
var_types = auditor.detect_variable_types()
normality = auditor.test_normality(var_types["continuous"])

# Select and execute bivariate tests
selector = BivariateTestSelector(df, stratify_by="treatment")
continuous_result = selector.test_continuous("age", is_normal=normality["age"])
cat_result = selector.test_categorical("smoker_status")

# Assemble publication-ready Table 1
assembler = TableOneAssembler(df, stratify_by="treatment")
table1 = assembler.build()
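
The quick start assumes a baseline.csv file with at least the columns treatment, age, and smoker_status. For trying it out without real patient data, a synthetic dataset can be generated along these lines. This is a sketch using only NumPy and pandas; the helper name make_example_baseline, the group labels, and the distributions are all made up for illustration.

```python
import numpy as np
import pandas as pd


def make_example_baseline(n=200, seed=0):
    """Build a synthetic baseline dataset (hypothetical helper, not part of clinipub)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "treatment": rng.choice(["control", "active"], size=n),
        "age": rng.normal(loc=60, scale=10, size=n).round(1),
        "smoker_status": rng.choice(["never", "former", "current"], size=n),
    })
    # Blank out n // 20 ages (5% of rows) so a missing-data audit has something to report
    missing_rows = rng.choice(n, size=n // 20, replace=False)
    df.loc[missing_rows, "age"] = np.nan
    return df


example = make_example_baseline()
```

Calling example.to_csv("baseline.csv", index=False) then writes a file that the quick start can load.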

API overview

  • MissingDataAuditor(data: pd.DataFrame)

    • calculate_missingness() → DataFrame with missing_count and missing_percentage
    • to_html_report(audit_df: pd.DataFrame = None, thresholds: dict = {'low': 5.0, 'mid': 30.0}) → styled HTML output
    • run_mcar_test() → dict with statistic, p_value, and degrees_of_freedom
  • ClinicalDataAuditor(data: pd.DataFrame)

    • detect_variable_types(max_categorical_threshold: int = 10) → {'categorical': [...], 'continuous': [...]}
    • test_normality(continuous_cols: list, alpha: float = 0.05) → dict mapping column names to booleans
  • BivariateTestSelector(data: pd.DataFrame, stratify_by: str)

    • test_continuous(col: str, is_normal: bool) → {'p_value': float, 'test': str}
    • test_categorical(col: str) → {'p_value': float, 'test': str}
  • TableOneAssembler(data: pd.DataFrame, stratify_by: str, columns: list = None)

    • build() → styled pandas.DataFrame with stratified descriptive statistics and p-values
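
To make the return shapes above concrete, here is a rough pandas-only sketch of a missingness table and a variable-type split. It is not clinipub's implementation: the helper names missingness_table and split_variable_types are hypothetical, and the classification rule (a numeric column with more than max_categorical_threshold distinct values counts as continuous, everything else as categorical) is an assumption based only on the parameter name.

```python
import pandas as pd


def missingness_table(df):
    """Per-column missing counts and percentages, most-missing columns first."""
    out = pd.DataFrame({
        "missing_count": df.isna().sum(),
        "missing_percentage": df.isna().mean() * 100,
    })
    return out.sort_values("missing_percentage", ascending=False)


def split_variable_types(df, max_categorical_threshold=10):
    """Assumed rule: numeric columns with many distinct values are continuous."""
    types = {"categorical": [], "continuous": []}
    for col in df.columns:
        is_numeric = pd.api.types.is_numeric_dtype(df[col])
        few_levels = df[col].nunique(dropna=True) <= max_categorical_threshold
        if is_numeric and not few_levels:
            types["continuous"].append(col)
        else:
            types["categorical"].append(col)
    return types


# Small made-up dataset, for illustration only
sample = pd.DataFrame({
    "age": [34.0, 51.5, None, 47.2, 63.1, 58.4, 29.9, 71.3, 45.0, 52.8, 66.6, 38.7],
    "sex": ["F", "M", "F", None, "M", "F", "M", "F", "M", "F", "M", "F"],
    "stage": [1, 2, 2, 3, 1, 1, 2, 3, 3, 2, 1, 2],
})
audit = missingness_table(sample)
var_types = split_variable_types(sample)
```

Under this rule an integer-coded column such as stage lands in the categorical group, which is usually what a Table 1 needs; the threshold is the knob to adjust when a numeric column is misclassified.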

Developer workflow

Run tests with:

uv run pytest

References

  • Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83(404), 1198-1202.
  • CDISC, STROBE, and medical publication best practices for clinical summary reporting.

License

MIT License.

Contributing

Contributions are welcome. Please read CONTRIBUTING.md for issue and pull request guidelines.
