An ontology-based rare disease common data model (RD-CDM) harmonising international registries, HL7 FHIR, and GA4GH Phenopackets.

ontology-based rare disease common data model

Welcome to the repo of the ontology-based rare disease common data model (RD-CDM) harmonising international registry use, HL7® FHIR®, and the GA4GH Phenopacket Schema.

Latest docs: https://rd-cdm.readthedocs.io/en/latest/

Manuscript

The corresponding paper for RD-CDM v2.0.0 has been published in Scientific Data (Nature Portfolio):
https://www.nature.com/articles/s41597-025-04558-z



Project Description

The ontology-based RD-CDM harmonises rare disease data capture across registries. It integrates ERDRI-CDS, HL7 FHIR, and the GA4GH Phenopacket Schema to support interoperable data for research and care. RD-CDM v2.0.x comprises 78 data elements covering formal criteria, personal information, patient status, disease, genetic findings, phenotypic findings, and family history.


What you get from PyPI

Installing rd-cdm from PyPI provides:

  • Schema

    • src/rd_cdm/schema/rd_cdm.yaml — LinkML schema defining the data model structure. The version and date fields here are the single source of truth for the data model version.
  • Instances

    • src/rd_cdm/instances/code_systems.yaml
    • src/rd_cdm/instances/data_elements.yaml
    • src/rd_cdm/instances/value_sets.yaml
    • src/rd_cdm/instances/rd_cdm.yaml — merged instance, version-stamped with rd_cdm_version and rd_cdm_date at the top
    • src/rd_cdm/instances/jsons/rd_cdm.json
    • src/rd_cdm/instances/csvs/rd_cdm.csv
  • Generated Python classes (LinkML)

    • src/rd_cdm/python_classes/rd_cdm.py — LinkML runtime dataclasses
    • src/rd_cdm/python_classes/rd_cdm_pydantic.py — Pydantic v2 models
  • CLI entry points

    • rd-cdm-merge — merge instance parts into rd_cdm.yaml
    • rd-cdm-json — export to jsons/rd_cdm.json
    • rd-cdm-csv — export to csvs/rd_cdm.csv
    • rd-cdm-validate — validate ontology codes via BioPortal
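Once the package is installed, the bundled files can be located without hard-coding a site-packages path. The sketch below uses the standard library's importlib.resources. It assumes the files listed above are installed under the rd_cdm package in the same relative layout (for example instances/jsons/rd_cdm.json), which is inferred from the src/ paths and not confirmed here. locate_packaged_file is a hypothetical helper, not part of rd-cdm.

```python
from importlib import resources
from importlib.util import find_spec


def locate_packaged_file(package, *parts):
    """Return a handle to a file inside an installed package, or None."""
    # find_spec returns None for a top-level package that is not installed.
    if find_spec(package) is None:
        return None
    candidate = resources.files(package)
    for part in parts:
        candidate = candidate.joinpath(part)
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Assumed layout: rd_cdm/instances/jsons/rd_cdm.json inside the package.
    found = locate_packaged_file("rd_cdm", "instances", "jsons", "rd_cdm.json")
    print(found if found is not None else "rd_cdm JSON export not found")
```

The helper returns None instead of raising, so a missing install or a different layout shows up as a clear message.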

Features

  • Interoperability: Aligns with HL7 FHIR v4.0.1 and GA4GH Phenopacket v2.0
  • Ontology-driven: Uses SNOMED CT, LOINC, NCIT, MONDO, OMIM, HPO, and more
  • Modular: Clear separation of schema, instances, and exports
  • Self-describing exports: Every YAML, JSON, and CSV file carries rd_cdm_version and rd_cdm_date, so each output identifies its data model version regardless of which package version was installed
  • Tooling: Merge, export, and validation utilities via simple CLI commands
  • Pydantic models: Runtime validation generated from the LinkML schema
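As an illustration of the self-describing exports, a consumer could read the version stamp from the JSON export before using the data. This is a minimal sketch that assumes rd_cdm_version and rd_cdm_date are top-level keys of the JSON object; the exact position of the stamp in the JSON file is not specified here. read_version_stamp is a hypothetical helper, not part of rd-cdm.

```python
import json
from pathlib import Path


def read_version_stamp(json_path):
    """Return (rd_cdm_version, rd_cdm_date) from a JSON export."""
    with Path(json_path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: both stamp fields are top-level keys of the JSON object.
    missing = [key for key in ("rd_cdm_version", "rd_cdm_date") if key not in data]
    if missing:
        raise KeyError(f"export is missing stamp field(s): {', '.join(missing)}")
    return str(data["rd_cdm_version"]), str(data["rd_cdm_date"])
```

Failing loudly on a missing stamp keeps downstream code from silently mixing data model versions.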

Installation

From PyPI:

pip install rd-cdm

Optional extras:

pip install rd-cdm[test]   # pytest, requests-mock
pip install rd-cdm[docs]   # sphinx, sphinx-rtd-theme, sphinx-copybutton
pip install rd-cdm[dev]    # linkml (for regenerating Python classes)

Development install

git clone https://github.com/BIH-CEI/rd-cdm.git
cd rd-cdm
python -m venv .venv && source .venv/bin/activate
pip install -U pip
pip install -e ".[test,dev]"
pytest -q

We use a src/ layout. Use the installed CLI entry points shown below rather than running scripts directly.


CLI tools

After installation the following commands are available:

# Merge instance parts → src/rd_cdm/instances/rd_cdm.yaml
rd-cdm-merge

# Export merged YAML → src/rd_cdm/instances/jsons/rd_cdm.json
rd-cdm-json

# Export merged YAML → src/rd_cdm/instances/csvs/rd_cdm.csv
rd-cdm-csv

# Validate all ontology codes against BioPortal
rd-cdm-validate

The recommended order when updating the model is:

rd-cdm-merge && rd-cdm-json && rd-cdm-csv && rd-cdm-validate
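The && chain stops at the first failing command. The same behaviour can be sketched in Python, for example for use in a build script. The command names come from this README; run_pipeline and its runner parameter are hypothetical and exist only to make the sketch testable. If a command is not on PATH, subprocess.run raises FileNotFoundError.

```python
import subprocess

PIPELINE = ["rd-cdm-merge", "rd-cdm-json", "rd-cdm-csv", "rd-cdm-validate"]


def run_pipeline(commands=PIPELINE, runner=subprocess.run):
    """Run each command in order; stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    """
    completed = []
    for command in commands:
        result = runner([command])
        if result.returncode != 0:
            break
        completed.append(command)
    return completed
```

Comparing the returned list with PIPELINE shows whether every step succeeded.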

Validating with BioPortal

rd-cdm-validate checks all ontology codes in the merged instance against BioPortal and reports version drift and missing or deprecated terms.
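For orientation, a single code lookup against the public BioPortal REST API can be sketched as follows. This is not taken from the rd-cdm-validate source, and the tool may query BioPortal differently. The sketch only builds the request and does not send it. build_class_request is a hypothetical helper, and the HPO term in the usage note is an arbitrary example.

```python
from urllib.parse import quote
from urllib.request import Request

BIOPORTAL_API = "https://data.bioontology.org"


def build_class_request(ontology, class_iri, api_key):
    """Build (but do not send) a BioPortal class-lookup request."""
    # The class IRI is percent-encoded so it fits in one path segment.
    encoded_iri = quote(class_iri, safe="")
    url = f"{BIOPORTAL_API}/ontologies/{ontology}/classes/{encoded_iri}"
    request = Request(url)
    # BioPortal accepts the key in an Authorization header of this form.
    request.add_header("Authorization", f"apikey token={api_key}")
    return request
```

For example, build_class_request("HP", "http://purl.obolibrary.org/obo/HP_0001250", key) targets one HPO class. Passing the result to urllib.request.urlopen would perform the lookup.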

Get an API key

Sign up at https://bioportal.bioontology.org/accounts/new, go to your account settings, and copy your API key.

Set the environment variable

macOS / Linux:

export BIOPORTAL_API_KEY="your-key-here"

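Windows (PowerShell):

setx BIOPORTAL_API_KEY "your-key-here"

setx stores the variable for future sessions only, so open a new terminal before running rd-cdm-validate.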
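A script that wraps rd-cdm-validate can check for the key up front and fail with a clear message. This is a minimal sketch: the variable name BIOPORTAL_API_KEY comes from this README, while get_bioportal_key and its env parameter are hypothetical.

```python
import os


def get_bioportal_key(env=None):
    """Return the BioPortal API key, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get("BIOPORTAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("BIOPORTAL_API_KEY is not set; export it before validating")
    return key
```

The optional env argument lets tests pass a plain dictionary instead of touching the real environment.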

Contributing and Contact

The RD-CDM is a community-driven effort. Please feel free to open issues, discuss features, or submit pull requests. For larger contributions, consider reaching out directly.

See the Contributing section of our documentation for full guidelines.

RareLink

RareLink is a REDCap-based framework for rare disease research linking international registries to FHIR and Phenopackets, built on the RD-CDM.


Resources

Ontologies

  • Human Phenotype Ontology 🔗
  • Mondo Disease Ontology (MONDO) 🔗
  • Online Mendelian Inheritance in Man 🔗
  • Orphanet Rare Disease Ontology 🔗
  • SNOMED CT 🔗
  • ICD-11 🔗
  • ICD-10-CM 🔗
  • NCBI Taxonomy 🔗
  • LOINC 🔗
  • HGNC 🔗
  • NCI Thesaurus OBO Edition 🔗

For the ontology versions used in each RD-CDM release, see the resources page in our documentation.


License

MIT License

Citing

Graefe, A.S.L., Hübner, M.R., Rehburg, F. et al. An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets. Sci Data 12, 234 (2025). https://doi.org/10.1038/s41597-025-04558-z
