DiNetxify
About DiNetxify
DiNetxify is an open-source Python package for comprehensive three-dimensional (3D) disease network analysis of large-scale electronic health record (EHR) data. It integrates data harmonization, analysis, and visualization into a user-friendly package to uncover multimorbidity patterns and disease progression pathways. DiNetxify is optimized for efficiency (it can process cohorts of hundreds of thousands of patients within hours on standard hardware) and supports multiple study designs with customizable parameters and parallel computing. DiNetxify is released under the GPL-3.0 license.
DiNetxify provides an end-to-end solution for 3D disease network analysis, featuring:
- Integrated Workflow: From raw EHR data to results and plots. DiNetxify guides you through data preprocessing, sequential analyses, and interactive visualizations in one coherent framework.
- Flexibility: Supports various cohort study designs, including standard cohort, matched cohort, and exposed-only cohort, and offers numerous parameters to tailor the analysis (e.g. significance thresholds, methods for network construction, etc.).
- User-Friendly API: High-level functions (e.g. a one-step pipeline) reduce coding overhead, while modular components allow fine-grained control. A dedicated data class handles data loading, cleaning, and ICD code mapping (to phecodes) automatically.
- Comprehensive Analyses: Combines phenome-wide association studies (PheWAS), comorbidity network analysis, and disease trajectory analysis to identify meaningful disease clusters and temporal sequences concurrently.
- Visualization: Built-in plotting tools generate interactive 3D network visualizations and static plots for PheWAS results, comorbidity networks, and disease trajectories, facilitating intuitive exploration of findings.
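As a rough illustration of the difference between a comorbidity pair (two diseases in the same person) and a trajectory pair (one disease diagnosed before the other), the toy sketch below counts both from first-diagnosis dates. The records, disease names, and the `count_pairs` helper are made up for illustration; this is plain counting, not DiNetxify's API and not the statistical testing the package performs.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Hypothetical toy records: patient -> {disease: date of first diagnosis}
records = {
    "p1": {"depression": date(2010, 1, 5), "hypertension": date(2012, 3, 1)},
    "p2": {"depression": date(2011, 6, 9), "hypertension": date(2011, 2, 2)},
    "p3": {
        "depression": date(2009, 4, 4),
        "hypertension": date(2015, 8, 8),
        "diabetes": date(2016, 1, 1),
    },
}


def count_pairs(records):
    """Count unordered co-occurrences and ordered (earlier, later) pairs."""
    cooccur = Counter()  # key: (disease_a, disease_b), alphabetical order
    ordered = Counter()  # key: (earlier_disease, later_disease)
    for diagnoses in records.values():
        for d1, d2 in combinations(sorted(diagnoses), 2):
            cooccur[(d1, d2)] += 1
            if diagnoses[d1] < diagnoses[d2]:
                ordered[(d1, d2)] += 1
            elif diagnoses[d2] < diagnoses[d1]:
                ordered[(d2, d1)] += 1
            # Same-day diagnoses carry no temporal order and are skipped.
    return cooccur, ordered


cooccur, ordered = count_pairs(records)
print(cooccur[("depression", "hypertension")])   # co-occurrence count
print(ordered[("depression", "hypertension")])   # depression first
print(ordered[("hypertension", "depression")])   # hypertension first
```

The same pair of diseases can therefore be frequent as a comorbidity while its direction is split between patients, which is why the two views are kept separate.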
Installation and Quick Start
Installation
DiNetxify requires Python 3.10+. Install the latest release from PyPI using pip:

    pip install dinetxify

This will install DiNetxify along with its dependencies. The required dependencies include: numpy, pandas, matplotlib, plotly, python_louvain, networkx, scikit_learn, scipy, statsmodels (>=0.14.4), and lifelines (optional).
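If you want to confirm what ended up in your environment, a small check along the following lines may help. It is only a sketch using the standard-library `importlib.metadata`; the `installed_versions` helper is hypothetical and not part of DiNetxify, and the distribution names are written in their PyPI spelling (`python-louvain`, `scikit-learn`).

```python
from importlib.metadata import PackageNotFoundError, version

# Dependencies listed above, using their PyPI distribution names.
DEPENDENCIES = [
    "numpy", "pandas", "matplotlib", "plotly", "python-louvain",
    "networkx", "scikit-learn", "scipy", "statsmodels", "lifelines",
]


def installed_versions(names=DEPENDENCIES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name:15s} {ver or 'not installed'}")
```

A missing `lifelines` entry is not necessarily a problem, since it is listed as optional.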
Quick start
To begin using DiNetxify:
- Install the package: Use the pip command above to install DiNetxify in your environment (Linux or Windows).
- Initialize and load data: Import DiNetxify and create a `DiseaseNetworkData` object with your chosen study design. Then load your cohort’s phenotype and medical records data into this object. The package will handle data validation and ICD-to-phecode mapping for you. You can download our test dummy data and run the following code:

      import DiNetxify as dnt

      # Define required columns and other covariates columns
      col_dict = {'Participant ID': 'ID',
                  'Exposure': 'exposure',
                  'Sex': 'sex',
                  'Index date': 'date_start',
                  'End date': 'date_end'}
      vars_lst = ['age', 'BMI']

      # Initialize the data object with study design and phecode level
      data = dnt.DiseaseNetworkData(study_design="cohort",
                                    phecode_level=1,
                                    date_fmt="%Y-%m-%d")

      # Load the phenotype CSV file into the data object
      data.phenotype_data(phenotype_data_path="dummy_phenotype.csv",
                          column_names=col_dict,
                          covariates=vars_lst)

      # Merge with the first medical records file (CSV)
      data.merge_medical_records(medical_records_data_path="dummy_EHR_ICD9.csv",
                                 diagnosis_code="ICD-9-WHO",
                                 column_names={'Participant ID': 'ID',
                                               'Diagnosis code': 'diag_icd9',
                                               'Date of diagnosis': 'dia_date'})
      data.merge_medical_records(medical_records_data_path="dummy_EHR_ICD10.csv",
                                 diagnosis_code="ICD-10-WHO",
                                 column_names={'Participant ID': 'ID',
                                               'Diagnosis code': 'diag_icd10',
                                               'Date of diagnosis': 'dia_date'})

- Run the analysis: Utilize the high-level pipeline function to perform the entire 3D network analysis on your `DiseaseNetworkData`:

      from DiNetxify import disease_network_pipeline

      # When using multiprocessing, ensure that the code is enclosed within the following block.
      # This prevents entering a never ending loop of new process creation.
      if __name__ == "__main__":
          results = disease_network_pipeline(data=data,
                                             n_process=4,
                                             n_threshold_phewas=100,
                                             n_threshold_comorbidity=100,
                                             output_dir="./results/",
                                             project_prefix="my_analysis")

Note: When using multiprocessing, multi-threading may not always shut down cleanly, which can cause conflicts that significantly degrade performance. We recommend disabling multi-threading by setting the following environment variables on Linux:

    export OPENBLAS_NUM_THREADS=1
    export MKL_NUM_THREADS=1
    export BLIS_NUM_THREADS=1
    export OMP_NUM_THREADS=1
    export NUMEXPR_NUM_THREADS=1

or the following on Windows:

    set OPENBLAS_NUM_THREADS=1
    set MKL_NUM_THREADS=1
    set BLIS_NUM_THREADS=1
    set OMP_NUM_THREADS=1
    set NUMEXPR_NUM_THREADS=1

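If setting shell variables is inconvenient (for example inside a notebook or a job script), the same variables can usually be set from Python instead. The sketch below assumes they are set at the very top of the script, before NumPy, SciPy, or DiNetxify is imported, because these libraries generally read the variables when they are first loaded. The `limit_native_threads` helper is hypothetical and not part of DiNetxify.

```python
import os

# Thread-count variables named in the note above.
THREAD_VARS = [
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "BLIS_NUM_THREADS",
    "OMP_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
]


def limit_native_threads(n=1):
    """Set every thread-count variable to n for this process and its children."""
    for name in THREAD_VARS:
        os.environ[name] = str(n)


limit_native_threads(1)

# Import numerical libraries and DiNetxify only after this point, e.g.:
# import DiNetxify as dnt
```

Worker processes started afterwards inherit the environment of the parent process, so the setting also applies to them.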
For a detailed tutorial on using DiNetxify, see our documentation at https://hzcohort.github.io/DiNetxify/
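Before loading your own cohort in place of the dummy data, it can be useful to check that the phenotype file has the columns you are about to map and that its dates match the `date_fmt` you pass to `DiseaseNetworkData`. The following is a minimal standard-library sketch; `check_phenotype_csv` is a hypothetical helper, not a DiNetxify function, and the sample rows (including how exposure and sex are coded) are invented purely to exercise the check.

```python
import csv
import io
from datetime import datetime


def check_phenotype_csv(fh, column_names, covariates, date_fmt="%Y-%m-%d"):
    """Return a list of problems found in a phenotype CSV opened as text."""
    reader = csv.DictReader(fh)
    header = reader.fieldnames or []
    required = list(column_names.values()) + list(covariates)
    missing = [col for col in required if col not in header]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    date_cols = [column_names["Index date"], column_names["End date"]]
    # Data rows start on line 2 of the file (line 1 is the header).
    for row_no, row in enumerate(reader, start=2):
        for col in date_cols:
            try:
                datetime.strptime(row[col], date_fmt)
            except (ValueError, TypeError):
                problems.append(
                    f"row {row_no}: {col}={row[col]!r} does not match {date_fmt}"
                )
    return problems


# In-memory stand-in for a phenotype file; the second end date is malformed.
sample = io.StringIO(
    "ID,exposure,sex,date_start,date_end,age,BMI\n"
    "1,1,1,2010-01-01,2015-06-30,54,27.1\n"
    "2,0,0,2011-03-15,30/06/2016,61,23.4\n"
)
col_dict = {"Participant ID": "ID", "Exposure": "exposure", "Sex": "sex",
            "Index date": "date_start", "End date": "date_end"}
problems = check_phenotype_csv(sample, col_dict, ["age", "BMI"])
print(problems)
```

For a real file, pass an open handle instead of the in-memory sample, for example `open("dummy_phenotype.csv", newline="")`.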
Citation
If you use this software in your research, please cite the following papers:
- Disease clusters and their genetic determinants following a diagnosis of depression: analyses based on a novel three-dimensional disease network approach (PMID: 40681841)
- DiNetxify: a Python package for three-dimensional disease network analysis based on electronic health record data
Contact
- Can Hou: houcan@wchscu.cn
- Haowen Liu: haowenliu81@gmail.com