A package for survival analysis with body composition analysis data
Survival Analysis Package
A Python package for analyzing survival data with a focus on body composition assessment. It is designed to work with the results of the BOA (Body and Organ Analysis) workflow. This repository provides tools to reorganize the BOA output and merge it with the patient table, utilities for data cleaning, and a lifelines wrapper for automated exploratory analysis of survival outcomes based on the body composition results.
Features
- Survival Analysis: Cox proportional hazards regression and Kaplan-Meier survival curves
- Body Composition Analysis: Tools for processing and analyzing BCA data
- BOA Extractor: Command-line tool for extracting measurements from BOA data
- Data Preprocessing: Utilities for cleaning and preparing survival data
- CLI Tools: Command-line utilities for data merging, format conversion, and PDF encryption
Installation
pip install bca-survival
Usage
Basic Survival Analysis
import pandas as pd

from bca_survival.analyzer import BCASurvivalAnalyzer
# Load the clinical and measurement tables, which share the same patient identifiers
df_main = pd.read_csv('clinical_data.csv')
df_measurements = pd.read_csv('bca_measurements.csv')
# Initialize the analyzer
analyzer = BCASurvivalAnalyzer(
    df_main, df_measurements,
    main_id_col='patient_id', measurement_id_col='id',
    start_date_col='diagnosis_date', event_date_col='event_date', event_col='event_status'
)
# Perform univariate analysis
columns = ['l5::WL::imat::mean_ml', 'l5::WL::tat::mean_ml', 'age', 'gender']
results = analyzer.univariate_cox_regression(columns)
# Generate Kaplan-Meier plot
analyzer.kaplan_meier_plot('l5::WL::imat::mean_ml', split_strategy='median')
# Perform multivariate analysis
model = analyzer.multivariate_cox_regression(columns)
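To make the Kaplan-Meier step more concrete, here is a minimal pure-Python sketch of what a Kaplan-Meier estimate with a median split computes. It is an illustration only, not the package's implementation (the package wraps lifelines); the function names `kaplan_meier` and `median_split` and the toy data are hypothetical.

```python
from statistics import median


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    # Only times with at least one observed event change the estimate.
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed events exactly at time t (censored patients do not count).
        died = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - died / at_risk
        curve.append((t, survival))
    return curve


def median_split(values):
    """Label each value 'high' if it lies above the median, else 'low'."""
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


# Hypothetical toy data: follow-up in days, event flag (1 = event,
# 0 = censored), and one body composition measurement per patient.
durations = [5, 8, 12, 12, 20, 25]
events = [1, 0, 1, 1, 0, 1]
imat = [10.0, 30.0, 12.0, 28.0, 35.0, 9.0]

groups = median_split(imat)
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([durations[i] for i in idx], [events[i] for i in idx])
    print(label, curve)
```

Each group gets its own step curve; comparing the two curves is what a median-split Kaplan-Meier plot visualizes.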
Command-Line Tools
The package includes several command-line tools for common data processing tasks:
BOA Extractor
Extract measurements from BOA (Body and Organ Analysis) data:
boa-extract /path/to/data /path/to/output
Purpose: Processes BOA data files and extracts relevant measurements for survival analysis.
Arguments:
- data_path: Path to the directory containing BOA data files
- output_path: Path where extracted measurements will be saved
BCA Merger
Merge two Excel files based on ID columns:
bca-merge <first_file> <second_file> <id_column_name>
Purpose: Combines clinical data with body composition measurements by matching on ID columns.
Arguments:
- first_file: Path to the first Excel file (e.g., clinical data)
- second_file: Path to the second Excel file (e.g., BCA measurements)
- id_column_name: Name of the ID column in the first file to match with 'StudyID' in the second file
Example:
bca-merge clinical_data.xlsx bca_measurements.xlsx patient_id
Output: Creates a file named {first_file}_merged.xlsx with:
- All rows from both files (outer join)
- Matched records combined into single rows
- Date columns formatted as DD.MM.YYYY
- No duplicate StudyID columns
Notes:
- The second file must have a column named 'StudyID'
- Uses outer merge to preserve all data from both files
- Automatically removes duplicate ID columns
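The merge semantics described above (outer join on the ID, a single ID column in the result, dates rendered as DD.MM.YYYY) can be sketched in plain Python. This is an illustration under assumptions, not the tool's code: the helper names `outer_merge` and `format_dates`, the sample rows, and the assumption that IDs are unique within each file are all hypothetical.

```python
from datetime import date


def outer_merge(left_rows, right_rows, left_id, right_id="StudyID"):
    """Outer-join two lists of dict rows on their ID columns."""
    # Assumes each ID occurs at most once per table.
    right_by_id = {row[right_id]: row for row in right_rows}
    merged, seen = [], set()
    for row in left_rows:
        key = row[left_id]
        match = right_by_id.get(key, {})
        # Drop the right-hand ID so the key is not duplicated in the output.
        extra = {k: v for k, v in match.items() if k != right_id}
        merged.append({**row, **extra})
        seen.add(key)
    # Keep right-hand rows that had no partner in the left table.
    for key, row in right_by_id.items():
        if key not in seen:
            extra = {k: v for k, v in row.items() if k != right_id}
            merged.append({left_id: key, **extra})
    return merged


def format_dates(row):
    """Render date values as DD.MM.YYYY strings, leaving other values alone."""
    return {k: v.strftime("%d.%m.%Y") if isinstance(v, date) else v
            for k, v in row.items()}


# Hypothetical sample rows standing in for the two Excel files.
clinical = [
    {"patient_id": "P1", "diagnosis_date": date(2021, 3, 5)},
    {"patient_id": "P2", "diagnosis_date": date(2020, 11, 30)},
]
measurements = [
    {"StudyID": "P2", "imat_ml": 41.5},
    {"StudyID": "P3", "imat_ml": 37.0},
]

for row in outer_merge(clinical, measurements, "patient_id"):
    print(format_dates(row))
```

In this sketch, unmatched rows simply lack the other table's columns; in a spreadsheet the same rows would show empty cells.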
Survival Result Converter
Convert Excel files to multiple formats (PDF, CSV, TXT):
survival-result-converter [directory]
Purpose: Batch converts Excel files to multiple formats for reporting and data sharing.
Arguments:
directory: Directory to scan for Excel files (default: current directory)
Example:
# Convert all Excel files in current directory
survival-result-converter
# Convert Excel files in specific directory
survival-result-converter /path/to/results
Output Structure:
directory/
├── PDF/
│   ├── file1.pdf
│   └── file2.pdf
├── CSV/
│   ├── file1.csv
│   ├── file2_sheet1.csv
│   └── file2_sheet2.csv
└── TXT/
    ├── file1.txt
    └── file2.txt
Features:
- Recursively processes all .xlsx files in the directory tree
- Creates separate output folders (PDF, CSV, TXT)
- For multi-sheet Excel files:
  - PDF: All sheets in single file
  - CSV: Separate file per sheet
  - TXT: All sheets in single file with separators
- PDF generation supports two methods:
  - Windows: Uses COM automation for high-quality output
  - Cross-platform: Uses fpdf library with automatic column sizing
PDF Features:
- Landscape orientation for better table visibility
- Automatic column width adjustment
- Fits tables to page width
- Handles large tables (up to 1000 rows per sheet)
- Text wrapping for long content
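The file naming shown under Output Structure can be expressed as a small path-planning function. This is a sketch inferred from the example tree, not the converter's code: the function name `planned_outputs` is hypothetical, and it assumes that all output folders sit directly under the scanned directory and that single-sheet workbooks get a CSV without a sheet suffix.

```python
from pathlib import Path


def planned_outputs(root, workbook, sheet_names):
    """List the output files expected for one workbook under `root`."""
    root = Path(root)
    stem = Path(workbook).stem
    # PDF and TXT combine all sheets into one file per workbook.
    outputs = [root / "PDF" / f"{stem}.pdf", root / "TXT" / f"{stem}.txt"]
    if len(sheet_names) == 1:
        outputs.append(root / "CSV" / f"{stem}.csv")
    else:
        # CSV cannot hold several sheets, so each sheet gets its own file.
        outputs += [root / "CSV" / f"{stem}_{name}.csv" for name in sheet_names]
    return outputs


for path in planned_outputs("results", "results/file2.xlsx", ["sheet1", "sheet2"]):
    print(path.as_posix())
```

A function like this is handy for checking, before distribution, that every expected output file was actually produced.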
PDF Report Extractor
Encrypt and organize PDF files from a directory tree:
pdf-report-extractor <input_path> <output_path> <password>
Purpose: Finds PDF files in a directory structure, copies them with standardized names, and encrypts them for secure distribution.
Arguments:
- input_path: Root directory to search for PDF files
- output_path: Destination directory for encrypted PDFs
- password: Password to encrypt the PDFs with
Example:
pdf-report-extractor /data/patient_reports /encrypted_reports MySecureP@ss123
Behavior:
- Recursively searches for all .pdf files
- Copies files to destination with naming pattern: encrypted_{parent_folder_name}.pdf
- Encrypts each file using user password protection
- Requires pdftk to be installed
Check pdftk Installation:
pdf-report-extractor --check-pdftk
Installing pdftk:
- Ubuntu/Debian: sudo apt-get install pdftk
- macOS: brew install pdftk-java
- Windows: Download from PDFtk website
Output Summary:
Processing: /data/patient_reports/folder1/report.pdf
-> /encrypted_reports/encrypted_folder1.pdf
-> Encrypted successfully
Processing complete:
- Files processed successfully: 15
- Errors: 0
Notes:
- Original files remain unchanged
- If encryption fails, the unencrypted copy is removed from destination
- Parent folder name is used for output filename (one level up from the PDF)
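The naming rule and the encryption step can be sketched as follows. This is a hedged illustration, not the tool's source: the helper names are hypothetical, and the exact pdftk invocation used internally is an assumption. The `output ... user_pw ...` form shown is standard pdftk syntax for setting a user password. The sketch only builds the command and does not run it.

```python
import shutil
from pathlib import Path


def destination_for(pdf_path, output_dir):
    """Name the output after the folder that directly contains the PDF."""
    pdf_path = Path(pdf_path)
    return Path(output_dir) / f"encrypted_{pdf_path.parent.name}.pdf"


def pdftk_command(source, destination, password):
    """Build a pdftk call that writes a user-password-protected copy."""
    return ["pdftk", str(source), "output", str(destination), "user_pw", password]


def pdftk_available():
    """Report whether a pdftk executable is on PATH."""
    return shutil.which("pdftk") is not None


src = Path("/data/patient_reports/folder1/report.pdf")
dst = destination_for(src, "/encrypted_reports")
print(dst.as_posix())
print(pdftk_command(src, dst, "example-password"))
print("pdftk found:", pdftk_available())
```

Because the output name depends only on the parent folder, two PDFs in the same folder map to the same destination under this pattern; the description above does not say how that case is handled, so one report per folder is the safe layout.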
Documentation
Detailed documentation lives in the docs/ directory. To build it locally:
- Install the package with documentation dependencies:
pip install -e ".[docs]"
- Build the documentation on Windows:
cd docs
make.bat html
Or on Linux/macOS:
cd docs
make html
- Open docs/build/html/index.html in your browser
Development
Clone the repository and install in development mode:
git clone https://gitlab.com/your-group/survival-analysis.git
cd survival-analysis
pip install -e ".[dev]"
Requirements
Core Dependencies
- pandas
- openpyxl (for Excel file handling)
- lifelines (for survival analysis)
Optional Dependencies
- For PDF conversion (survival-result-converter):
  - Windows: pywin32
  - Cross-platform: fpdf, openpyxl
- For PDF encryption (pdf-report-extractor):
  - pdftk (external dependency)
Common Workflows
Workflow 1: Complete Data Processing Pipeline
# 1. Merge clinical and BCA data
bca-merge clinical.xlsx measurements.xlsx PatientID
# 2. Perform survival analysis (Python)
# ... (use BCASurvivalAnalyzer)
# 3. Convert results to multiple formats
survival-result-converter ./results
# 4. Encrypt PDF reports for distribution
pdf-report-extractor ./results/PDF ./encrypted_reports SecurePassword123
Workflow 2: Quick Data Conversion
# Convert a directory of Excel results to PDF
survival-result-converter /path/to/results
# PDFs are created in /path/to/results/PDF/