Transform clinical dataframes into publication-ready, beautifully styled tables for medical journals and manuscripts.
clinipub
A lightweight Python toolkit that turns clinical pandas.DataFrame datasets into publication-ready Table 1 summaries and missing-data audits.
clinipub automates clinical descriptive analytics, standard test selection, and publication-quality output for medical research manuscripts, registries, and regulatory reporting.
Why this package exists
Baseline tables and missingness audits are essential in clinical research, but most teams still prepare them manually. clinipub delivers a reproducible, data-first workflow for clinical scientists and medical writers.
Key features
- MissingDataAuditor: compute missingness, generate a styled HTML report, and run Little's MCAR test
- ClinicalDataAuditor: detect categorical vs continuous variables and test continuous normality
- BivariateTestSelector: select the appropriate clinical test automatically for continuous and categorical comparisons
- TableOneAssembler: assemble a stratified Table 1 with descriptive statistics and p-values
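To make "compute missingness" concrete, here is a minimal sketch in plain pandas of the kind of per-column summary a missingness audit produces. The helper name summarize_missingness and the toy columns are hypothetical, and this is not clinipub's implementation; only the output column names missing_count and missing_percentage are taken from the API overview.

```python
import numpy as np
import pandas as pd


def summarize_missingness(data: pd.DataFrame) -> pd.DataFrame:
    """Per-column missing counts and percentages (illustrative helper, not clinipub's API)."""
    missing_count = data.isna().sum()
    missing_percentage = (missing_count / len(data) * 100).round(1)
    summary = pd.DataFrame(
        {"missing_count": missing_count, "missing_percentage": missing_percentage}
    )
    # Put the most incomplete variables first so they are easy to spot.
    return summary.sort_values("missing_percentage", ascending=False)


# Toy data with made-up column names, used only to exercise the helper.
toy = pd.DataFrame(
    {
        "age": [54, 61, np.nan, 47],
        "sbp": [np.nan, 132.0, np.nan, 118.0],
        "treatment": ["A", "B", "A", "B"],
    }
)
summary = summarize_missingness(toy)
print(summary)
```

A table like this is what the thresholds in the HTML report would presumably be applied to, for example flagging columns above a "mid" percentage.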
Installation
Install from PyPI:
pip install clinipub
Install from source for development:
git clone https://github.com/arsalananwar11/clinipub.git
cd clinipub
uv sync
Quick start
import pandas as pd
from clinipub import (
    MissingDataAuditor,
    ClinicalDataAuditor,
    BivariateTestSelector,
    TableOneAssembler,
)
# Load clinical data
df = pd.read_csv("baseline.csv")
# Audit missing data and run Little's MCAR test
missing_auditor = MissingDataAuditor(df)
missing_df = missing_auditor.calculate_missingness()
mcar_results = missing_auditor.run_mcar_test()
html_report = missing_auditor.to_html_report(
    audit_df=missing_df,
    thresholds={"low": 1.0, "mid": 20.0},
)
# Detect variable types and assess normality
auditor = ClinicalDataAuditor(df)
var_types = auditor.detect_variable_types()
normality = auditor.test_normality(var_types["continuous"])
# Select and execute bivariate tests
selector = BivariateTestSelector(df, stratify_by="treatment")
continuous_result = selector.test_continuous("age", is_normal=normality["age"])
cat_result = selector.test_categorical("smoker_status")
# Assemble publication-ready Table 1
assembler = TableOneAssembler(df, stratify_by="treatment")
table1 = assembler.build()
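The quick start assumes a baseline.csv with treatment, age, and smoker_status columns. If no such file is at hand, a synthetic one can be generated for experimentation. The sketch below is an assumption-laden stand-in: the function name make_synthetic_baseline, the category labels, the distribution parameters, and the 10% missingness rate are all invented for illustration and carry no clinical meaning.

```python
import numpy as np
import pandas as pd


def make_synthetic_baseline(n: int = 200, seed: int = 42) -> pd.DataFrame:
    """Build a fake baseline dataset with the columns used in the quick start."""
    rng = np.random.default_rng(seed)
    data = pd.DataFrame(
        {
            "treatment": rng.choice(["placebo", "active"], size=n),
            "age": rng.normal(loc=60, scale=10, size=n).round(1),
            "smoker_status": rng.choice(["never", "former", "current"], size=n),
        }
    )
    # Blank out roughly 10% of ages so a missing-data audit has something to report.
    missing_mask = rng.random(n) < 0.10
    data.loc[missing_mask, "age"] = np.nan
    return data


df = make_synthetic_baseline()
print(df.head())
```

Calling df.to_csv("baseline.csv", index=False) afterwards writes a file that the quick start can load with pd.read_csv.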
API overview
- MissingDataAuditor(data: pd.DataFrame)
  - calculate_missingness() → DataFrame with missing_count and missing_percentage
  - to_html_report(audit_df: pd.DataFrame = None, thresholds: dict = {'low': 5.0, 'mid': 30.0}) → styled HTML output
  - run_mcar_test() → dict with statistic, p_value, and degrees_of_freedom
- ClinicalDataAuditor(data: pd.DataFrame)
  - detect_variable_types(max_categorical_threshold: int = 10) → {'categorical': [...], 'continuous': [...]}
  - test_normality(continuous_cols: list, alpha: float = 0.05) → dict mapping column names to booleans
- BivariateTestSelector(data: pd.DataFrame, stratify_by: str)
  - test_continuous(col: str, is_normal: bool) → {'p_value': float, 'test': str}
  - test_categorical(col: str) → {'p_value': float, 'test': str}
- TableOneAssembler(data: pd.DataFrame, stratify_by: str, columns: list = None)
  - build() → styled pandas.DataFrame with stratified descriptive statistics and p-values
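The overview above does not say which statistical tests the selector chooses. As a rough illustration of how such a selection can work, the sketch below uses scipy.stats directly for a two-group comparison. The specific rules (Welch t-test when normal, Mann-Whitney U otherwise, Fisher's exact test for 2x2 tables with small expected counts, chi-square otherwise) are common conventions assumed here, not confirmed clinipub behavior, and the function names are hypothetical. Only the returned dict shape mirrors the documented one.

```python
import pandas as pd
from scipy import stats


def compare_continuous(data: pd.DataFrame, col: str, group_col: str, is_normal: bool) -> dict:
    """Two-group comparison of a continuous variable (assumed rule, not clinipub's)."""
    groups = [g[col].dropna() for _, g in data.groupby(group_col)]
    if len(groups) != 2:
        raise ValueError("this sketch only handles exactly two groups")
    if is_normal:
        result = stats.ttest_ind(groups[0], groups[1], equal_var=False)
        name = "Welch t-test"
    else:
        result = stats.mannwhitneyu(groups[0], groups[1], alternative="two-sided")
        name = "Mann-Whitney U"
    return {"p_value": float(result.pvalue), "test": name}


def compare_categorical(data: pd.DataFrame, col: str, group_col: str) -> dict:
    """Chi-square test, falling back to Fisher's exact test for sparse 2x2 tables."""
    table = pd.crosstab(data[col], data[group_col]).to_numpy()
    _, p_value, _, expected = stats.chi2_contingency(table)
    if table.shape == (2, 2) and (expected < 5).any():
        _, p_value = stats.fisher_exact(table)
        return {"p_value": float(p_value), "test": "Fisher exact"}
    return {"p_value": float(p_value), "test": "Chi-square"}


# Toy data with made-up values, used only to exercise both helpers.
toy = pd.DataFrame(
    {
        "treatment": ["A"] * 4 + ["B"] * 4,
        "age": [50, 52, 54, 56, 60, 62, 64, 66],
        "smoker": ["yes", "no", "yes", "no", "yes", "no", "no", "no"],
    }
)
normal_result = compare_continuous(toy, "age", "treatment", is_normal=True)
rank_result = compare_continuous(toy, "age", "treatment", is_normal=False)
cat_result = compare_categorical(toy, "smoker", "treatment")
print(normal_result, rank_result, cat_result)
```

Passing is_normal in from a separate normality check, as the quick start does, keeps the decision about distributional assumptions apart from the comparison itself.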
Developer workflow
Run tests with:
uv run pytest
References
- Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association.
- CDISC, STROBE, and medical publication best practices for clinical summary reporting.
License
MIT License.
Contributing
Contributions are welcome. Please read CONTRIBUTING.md for issue and pull request guidelines.