A Python wrapper around the cmprsk R package

cmprsk - Competing Risks Regression

Regression modeling of subdistribution functions in competing risks.

A Python wrapper around the cmprsk R package.

Description: Estimation, testing and regression modeling of subdistribution functions in competing risks, as described in Gray (1988), A class of K-sample tests for comparing the cumulative incidence of a competing risk, Ann. Stat. 16:1141-1154, and Fine JP and Gray RJ (1999), A proportional hazards model for the subdistribution of a competing risk, JASA, 94:496-509.

Original Package documentation

Requirements

Python

  • Only Python 3 is supported; Python 3.8 or later is recommended.

  • The original version of this package was written against rpy2 version 2.9.4. rpy2 has since had many breaking changes, so cmprsk version 0.X.Y works only with rpy2 version 2.9.X.
  • cmprsk version 1.X.Y is up to date and uses rpy2 3.4.5.
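The version pairing above can be expressed as a small check. This is a minimal sketch, not part of the package: the function name compatible_cmprsk_series is hypothetical, and treating every rpy2 3.x release as suitable for cmprsk 1.x is an assumption (the text only confirms 3.4.5).

```python
def compatible_cmprsk_series(rpy2_version: str) -> str:
    """Map an rpy2 version string to the cmprsk release series described above.

    Hypothetical helper for illustration; it is not provided by cmprsk.
    """
    # Pad so that a bare major version such as "3" still parses.
    parts = (rpy2_version.split(".") + ["0"])[:2]
    major, minor = int(parts[0]), int(parts[1])
    if (major, minor) == (2, 9):
        # cmprsk 0.X.Y only works with rpy2 2.9.X.
        return "0.x"
    if major == 3:
        # Assumption: cmprsk 1.X.Y uses rpy2 3.4.5; other 3.x releases are untested here.
        return "1.x"
    raise ValueError(f"no known cmprsk series for rpy2 {rpy2_version}")
```

With rpy2 installed, the version string can be obtained from importlib.metadata.version("rpy2") and passed to this function.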

Installation steps

  • install R
  • install the cmprsk R package: open an R terminal and run install.packages("cmprsk")
  • create a virtual environment (recommended)
  • install rpy2; if you use conda to create the virtual environment on macOS with an M1 (Apple silicon) chip, install rpy2 using pip
  • install pandas
  • install scipy
  • install pytest (development only, for running the unit tests)

This package uses rpy2 to import the cmprsk R package, so the requirements for rpy2 must also be met.
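Before importing the package, it can help to confirm that the prerequisites listed above are visible from the active environment. The following is a rough sketch under stated assumptions: check_prerequisites is a hypothetical helper, it only looks for an R executable on the PATH and for importable Python modules, and it does not verify that the cmprsk R package itself is installed inside R.

```python
import importlib.util
import shutil


def check_prerequisites(modules=("rpy2", "pandas", "scipy")):
    """Return a dict mapping each prerequisite name to True if it was found.

    Hypothetical helper for illustration; it is not provided by cmprsk.
    """
    # R is located by searching the PATH for an executable named "R".
    status = {"R": shutil.which("R") is not None}
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```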

TL;DR

Quickstart

For example usage, consult the tutorial notebook in this repo: package_usage.ipynb

Example: crr

import pandas as pd

import cmprsk.cmprsk as cmprsk

from cmprsk import utils

data = pd.read_csv('my_data_file.csv')
# assuming that x1, x2, x3 and x4 are covariates;
# x1 and x4 are categorical, with baseline 'd' for x1 and 5 for x4
static_covariates = utils.as_indicators(data[['x1', 'x2', 'x3', 'x4']], ['x1', 'x4'], bases=['d', 5])

crr_result = cmprsk.crr(data['ftime'], data['fstatus'], static_covariates)
report = crr_result.summary

print(report)

ftime and fstatus can be NumPy arrays or pandas Series, and static_covariates is a pandas DataFrame. The report is a pandas DataFrame as well.
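To make the idea of indicator coding against a baseline concrete, here is a dependency-free sketch. It is not the implementation of utils.as_indicators: the function name indicator_columns and the "name_level" column naming are assumptions made for illustration only.

```python
def indicator_columns(name, values, baseline):
    """Return {column_name: [0 or 1, ...]} for every level except the baseline.

    Hypothetical illustration; the real utils.as_indicators may name and
    order its output columns differently.
    """
    # Every distinct level other than the baseline gets its own 0/1 column.
    levels = sorted(set(values) - {baseline}, key=str)
    return {
        f"{name}_{level}": [int(value == level) for value in values]
        for level in levels
    }


# Rows whose value equals the baseline 'd' are 0 in every indicator column.
x1 = ["a", "d", "b", "a"]
print(indicator_columns("x1", x1, baseline="d"))
```

The baseline level has no column of its own, so each regression coefficient is read as the effect of a level relative to the baseline.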

Example: cuminc

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


from cmprsk import cmprsk

data = pd.read_csv('cmprsk/cmprsk/tests/test_set.csv')
cuminc_res = cmprsk.cuminc(data.ss, data.cc, group=data.gg, strata=data.strt)

# print
cuminc_res.print

# plot using matplotlib

_, ax = plt.subplots()
for name, group in cuminc_res.groups.items():
    ax.plot(group.time, group.est, label=name)
    ax.fill_between(group.time, group.low_ci, group.high_ci, alpha=0.4)
    
ax.set_ylim([0, 1])
ax.legend()
ax.set_title('foo bar')
plt.show()
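For readers who want to see what a cumulative incidence estimate is, the sketch below computes the standard nonparametric estimate for one cause in plain Python. It is not the code used by the R package or this wrapper: it ignores groups, strata and confidence intervals, the function name cumulative_incidence is hypothetical, and coding censored observations as status 0 is an assumption.

```python
def cumulative_incidence(times, statuses, cause):
    """Return [(time, estimate), ...] at each observed event time.

    Hypothetical illustration. Assumes status 0 means censored and any other
    value identifies the cause of the event.
    """
    overall_survival = 1.0  # probability of no event of any cause so far
    incidence = 0.0         # cumulative incidence of the requested cause
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        events_here = [s for x, s in zip(times, statuses) if x == t and s != 0]
        if not events_here:
            continue  # only censored observations at this time
        cause_events = sum(1 for s in events_here if s == cause)
        # Probability of surviving all causes up to t, then failing from `cause` at t.
        incidence += overall_survival * cause_events / at_risk
        # Update all-cause survival after the increment has been computed.
        overall_survival *= 1.0 - len(events_here) / at_risk
        curve.append((t, incidence))
    return curve


for t, est in cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause=1):
    print(t, round(est, 3))
```

In this toy input, the estimate for cause 1 starts at 0.25 after the first event and reaches approximately 0.75 at time 4; the estimates for causes 1 and 2 together account for all subjects because the last observation is an event.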

Development

To run the unit tests, run

pytest --cov=cmprsk cmprsk/tests/

from the project root.

Current coverage

---------- coverage: platform darwin, python 3.9.7-final-0 -----------
Name                             Stmts   Miss  Cover
----------------------------------------------------
cmprsk/__init__.py                   0      0   100%
cmprsk/cmprsk.py                   128     22    83%
cmprsk/rpy_utils.py                 44     10    77%
cmprsk/tests/__init__.py             0      0   100%
cmprsk/tests/test_cmprsk.py         30      0   100%
cmprsk/tests/test_rpy_utils.py      27      1    96%
cmprsk/tests/test_utils.py          37      0   100%
cmprsk/utils.py                     23      1    96%
----------------------------------------------------
TOTAL                              289     34    88%

How to update the package:

  1. update the version in setup.py
  2. remove the dist directory: rm -fr dist
  3. python setup.py sdist bdist_wheel
  4. twine upload dist/* --verbose
