An ontology-based rare disease common data model (RD-CDM) harmonising international registries, HL7 FHIR, and GA4GH Phenopackets.
ontology-based rare disease common data model
Welcome to the repo of the ontology-based rare disease common data model (RD-CDM) harmonising international registry use, HL7® FHIR®, and the GA4GH Phenopacket Schema.
Latest docs: https://rd-cdm.readthedocs.io/en/latest/
Manuscript
The corresponding paper for RD-CDM v2.0.0 has been published in Scientific Data (Nature Portfolio):
https://www.nature.com/articles/s41597-025-04558-z
Table of Contents
- Project Description
- What you get from PyPI
- Features
- Installation
- CLI tools
- Validating with BioPortal
- Contributing and Contact
- Resources
- License
- Citing
- Acknowledgements
Project Description
The ontology-based RD-CDM harmonises rare disease data capture across registries. It integrates ERDRI-CDS, HL7 FHIR, and the GA4GH Phenopacket Schema to support interoperable data for research and care. RD-CDM v2.0.x comprises 78 data elements covering formal criteria, personal information, patient status, disease, genetic findings, phenotypic findings, and family history.
What you get from PyPI
Installing rd-cdm from PyPI provides:
- Schema
  - src/rd_cdm/schema/rd_cdm.yaml: LinkML schema defining the data model structure. The version and date fields here are the single source of truth for the data model version.
- Instances
  - src/rd_cdm/instances/code_systems.yaml
  - src/rd_cdm/instances/data_elements.yaml
  - src/rd_cdm/instances/value_sets.yaml
  - src/rd_cdm/instances/rd_cdm.yaml: merged instance, version-stamped with rd_cdm_version and rd_cdm_date at the top
  - src/rd_cdm/instances/jsons/rd_cdm.json
  - src/rd_cdm/instances/csvs/rd_cdm.csv
- Generated Python classes (LinkML)
  - src/rd_cdm/python_classes/rd_cdm.py: LinkML runtime dataclasses
  - src/rd_cdm/python_classes/rd_cdm_pydantic.py: Pydantic v2 models
- CLI entry points
  - rd-cdm-merge: merge instance parts into rd_cdm.yaml
  - rd-cdm-json: export to jsons/rd_cdm.json
  - rd-cdm-csv: export to csvs/rd_cdm.csv
  - rd-cdm-validate: validate ontology codes via BioPortal
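Because the instance files ship inside the package, they can be located at runtime without knowing where pip placed them. The following is a minimal sketch using the standard library's importlib.resources. It assumes the import package is named rd_cdm (inferred from the src/rd_cdm/ paths above) and that the data files are included in the installed distribution; packaged_file and merged_instance_text are illustrative helper names, not part of the rd-cdm API.

```python
from importlib.resources import files


def packaged_file(package: str, *parts: str):
    """Return a path-like handle to a file shipped inside an installed package."""
    resource = files(package)
    for part in parts:
        resource = resource / part
    return resource


def merged_instance_text() -> str:
    """Read the merged instance YAML as text.

    Assumption: rd-cdm is installed and exposes the import package `rd_cdm`
    with the layout listed above (instances/rd_cdm.yaml).
    """
    return packaged_file("rd_cdm", "instances", "rd_cdm.yaml").read_text(encoding="utf-8")
```

Calling merged_instance_text() in an environment where rd-cdm is installed should return the YAML content, which can then be handed to any YAML parser.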
Features
- Interoperability: Aligns with HL7 FHIR v4.0.1 and GA4GH Phenopacket v2.0
- Ontology-driven: Uses SNOMED CT, LOINC, NCIT, MONDO, OMIM, HPO, and more
- Modular: Clear separation of schema, instances, and exports
- Self-describing exports: Every YAML, JSON, and CSV file carries rd_cdm_version and rd_cdm_date, so outputs are unambiguous without needing to know which package version was installed
- Tooling: Merge, export, and validation utilities via simple CLI commands
- Pydantic models: Runtime validation generated from the LinkML schema
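As an illustration of the self-describing exports, a consumer can read the version stamp before processing a file. This is a minimal sketch that assumes rd_cdm_version and rd_cdm_date are top-level keys of the JSON export; the exact layout is not spelled out here, and the sample document and its values are placeholders, not real release data.

```python
import json


def read_version_stamp(json_text: str) -> tuple[str, str]:
    """Return (rd_cdm_version, rd_cdm_date) from a JSON export.

    Assumption: both fields are top-level keys of a JSON object.
    """
    data = json.loads(json_text)
    try:
        return str(data["rd_cdm_version"]), str(data["rd_cdm_date"])
    except KeyError as missing:
        raise ValueError(f"export is not version-stamped: missing {missing}") from None


# Placeholder export for illustration only; the values are made up.
sample = '{"rd_cdm_version": "0.0.0", "rd_cdm_date": "1970-01-01", "data_elements": []}'
version, date = read_version_stamp(sample)
```

Failing early on a missing stamp keeps downstream code from silently mixing exports of different model versions.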
Installation
From PyPI:
pip install rd-cdm
Optional extras:
pip install "rd-cdm[test]" # pytest, requests-mock
pip install "rd-cdm[docs]" # sphinx, sphinx-rtd-theme, sphinx-copybutton
pip install "rd-cdm[dev]" # linkml (for regenerating Python classes)
Development install
git clone https://github.com/BIH-CEI/rd-cdm.git
cd rd-cdm
python -m venv .venv && source .venv/bin/activate
pip install -U pip
pip install -e ".[test,dev]"
pytest -q
We use a src/ layout. Use the installed CLI entry points shown below rather than running scripts directly.
CLI tools
After installation the following commands are available:
# Merge instance parts → src/rd_cdm/instances/rd_cdm.yaml
rd-cdm-merge
# Export merged YAML → src/rd_cdm/instances/jsons/rd_cdm.json
rd-cdm-json
# Export merged YAML → src/rd_cdm/instances/csvs/rd_cdm.csv
rd-cdm-csv
# Validate all ontology codes against BioPortal
rd-cdm-validate
The recommended order when updating the model is:
rd-cdm-merge && rd-cdm-json && rd-cdm-csv && rd-cdm-validate
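The same order can be driven from Python, for example in a release script. The sketch below mirrors the `&&` chain by stopping at the first command that exits with a non-zero code. It assumes the four entry points are on PATH and take no required arguments, as in the commands above; run_pipeline is an illustrative helper, not part of rd-cdm.

```python
import subprocess
import sys

# Order taken from the recommendation above: merge, export, then validate.
PIPELINE = ["rd-cdm-merge", "rd-cdm-json", "rd-cdm-csv", "rd-cdm-validate"]


def run_pipeline(commands=PIPELINE, run=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    Raises FileNotFoundError if a command is not installed on PATH.
    """
    completed = []
    for name in commands:
        result = run([name], check=False)
        if result.returncode != 0:
            print(f"{name} failed with exit code {result.returncode}", file=sys.stderr)
            break
        completed.append(name)
    return completed
```

The run parameter exists only so the function can be exercised without invoking the real tools; calling run_pipeline() with no arguments executes the actual commands.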
Validating with BioPortal
rd-cdm-validate checks all ontology codes in the merged instance against
BioPortal and reports version drift and missing or deprecated terms.
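To give a sense of what a single code lookup involves, the sketch below builds a request against the public BioPortal REST API for one ontology class. It is not taken from rd-cdm-validate and may differ from how that tool works; build_class_request is an illustrative helper, and the HPO term is just an example. The request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

BIOPORTAL_API = "https://data.bioontology.org"


def build_class_request(ontology: str, class_iri: str, api_key: str) -> Request:
    """Build a GET request for one class in one BioPortal ontology.

    The class IRI is fully percent-encoded because it becomes a single
    path segment of the request URL.
    """
    encoded_iri = quote(class_iri, safe="")
    url = f"{BIOPORTAL_API}/ontologies/{ontology}/classes/{encoded_iri}"
    return Request(url, headers={"Authorization": f"apikey token={api_key}"})


# Example: the HPO class for "Seizure"; the key is a placeholder.
request = build_class_request(
    "HP", "http://purl.obolibrary.org/obo/HP_0001250", "your-key-here"
)
```

Passing the request to urllib.request.urlopen would perform the lookup; a missing class typically surfaces as an HTTP error.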
Get an API key
Sign up at https://bioportal.bioontology.org/accounts/new, go to your account settings, and copy your API key.
Set the environment variable
macOS / Linux:
export BIOPORTAL_API_KEY="your-key-here"
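Windows (PowerShell):
setx BIOPORTAL_API_KEY "your-key-here"
Open a new terminal afterwards: setx stores the variable for future sessions and does not change the current one.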
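A script that wraps the validator can check for the key up front and fail with a clear message. This is a minimal sketch; how rd-cdm-validate itself reads the variable is not shown here, and get_bioportal_api_key is an illustrative helper rather than part of the package.

```python
import os

ENV_VAR = "BIOPORTAL_API_KEY"


def get_bioportal_api_key(env=os.environ) -> str:
    """Return the BioPortal API key from the environment.

    Raises RuntimeError if the variable is unset or blank.
    """
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set; define it before running rd-cdm-validate")
    return key
```

The env parameter defaults to the real process environment and can be replaced with a plain dict in tests.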
Contributing and Contact
The RD-CDM is a community-driven effort. Please feel free to open issues, discuss features, or submit pull requests. For larger contributions consider reaching out directly.
See the Contributing section of our documentation for full guidelines.
RareLink
RareLink is a REDCap-based framework for rare disease research linking international registries to FHIR and Phenopackets, built on the RD-CDM.
Resources
Ontologies
- Human Phenotype Ontology
- Monarch Initiative Disease Ontology
- Online Mendelian Inheritance in Man
- Orphanet Rare Disease Ontology
- SNOMED CT
- ICD-11
- ICD-10-CM
- NCBI Taxonomy
- LOINC
- HGNC
- NCI Thesaurus OBO Edition
For the ontology versions used in each RD-CDM release, see the resources page in our documentation.
License
Citing
Graefe, A.S.L., Hübner, M.R., Rehburg, F. et al. An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets. Sci Data 12, 234 (2025). https://doi.org/10.1038/s41597-025-04558-z
Acknowledgements
- Adam SL Graefe
- Filip Rehburg
- Samer Alkarkoukly
- Daniel Danis
- Peter N. Robinson
- Oya Beyan
- Sylvia Thun
File details
Details for the file rd_cdm-2.0.3.tar.gz.
File metadata
- Download URL: rd_cdm-2.0.3.tar.gz
- Upload date:
- Size: 69.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.12.13 Darwin/24.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3864118d22c593fd0908b518292b8d86a16064981b4ed1a9d4796ee9d0b40218 |
| MD5 | 61420ae01799a0e18216422fb5c7ec20 |
| BLAKE2b-256 | 49ae5b55587bce655013e3ac2a5ed4a414fd2190a1bca6af8dba7a82873df2c6 |
File details
Details for the file rd_cdm-2.0.3-py3-none-any.whl.
File metadata
- Download URL: rd_cdm-2.0.3-py3-none-any.whl
- Upload date:
- Size: 90.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.12.13 Darwin/24.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f4bc05617d7f718b7dabfcce9e70840d74b9740c461ce4c49c039d91a15f90f1 |
| MD5 | 83f215365269a06350e951dde41380df |
| BLAKE2b-256 | 6306a39efab65832f04048b9ed4ddb6845bb8c6c62a742745f0f298a8c00544c |