
Maps GDC and Cellosaurus schemas to the FHIR schema.

Project description

fhirizer

License: MIT

Project overview:

Transforms and harmonizes data from Genomic Data Commons (GDC), Cellosaurus cell-lines, and International Cancer Genome Consortium (ICGC) repositories into 🔥 FHIR (Fast Healthcare Interoperability Resources) format.

  • GDC study simplified FHIR graph


Usage

Installation

  • from source
git clone repo
cd fhirizer
# create a virtual environment, e.g.:
# NOTE: package_data folders must be on the Python path inside virtual environments
python -m venv venv-fhirizer
source venv-fhirizer/bin/activate
pip install . 
  • Dockerfile
(sudo) docker build -t <tag-name>:latest .
(sudo) docker run -it  --mount type=bind,source=<path-to-input-ndjson>,target=/opt/data --rm <tag-name>:latest
  • Singularity
singularity build fhirizer.sif docker://quay.io/ohsu-comp-bio/fhirizer
singularity shell fhirizer.sif
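
After installing, it can help to confirm that the package and its command-line entry point are visible from the active environment. The following is a minimal sketch, assuming the import name is `fhirizer` and the console script is also called `fhirizer` (both inferred from the commands in this README, not verified against the package). The helper `check_install` is hypothetical and not part of fhirizer.

```python
import importlib.util
import shutil


def check_install(package: str = "fhirizer", command: str = "fhirizer") -> dict:
    """Report whether a package is importable and a CLI command is on PATH."""
    return {
        # find_spec returns None when a top-level package cannot be located
        "importable": importlib.util.find_spec(package) is not None,
        # shutil.which returns None when no executable of that name is on PATH
        "cli_on_path": shutil.which(command) is not None,
    }


if __name__ == "__main__":
    # Assumed names; adjust if the package or script is named differently.
    print(check_install())
```

If either value is false inside a virtual environment, re-activating the environment and re-running `pip install .` is a reasonable first step.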

Convert and Generate

A detailed step-by-step guide to FHIRizing data for a project's study can be found in the project's directory overview.

  • GDC

    • convert GDC schema keys to the FHIR mapping

    • generate FHIR object models as ndjson files in the output directory

      Example run for patients (cases); replace the paths with your own ndjson files or directories.

    fhirizer convert --name case --in_path ./projects/<my-project>/cases.ndjson --out_path ./projects/<my-project>/cases_key.ndjson --verbose True
    
    fhirizer generate --name case --out_dir ./projects/<my-project>/META --entity_path ./projects/<my-project>/cases_key.ndjson
    
    • to generate document references for the patients:
    fhirizer convert --name file --in_path ./projects/<my-project>/files.ndjson --out_path ./projects/<my-project>/files_key.ndjson --verbose True
    
    fhirizer generate --name file --out_dir ./projects/<my-project>/META --entity_path ./projects/<my-project>/files_key.ndjson
    
  • Cellosaurus

     fhirizer generate --name cellosaurus --out_dir ./projects/<my-project>/META --entity_path ./projects/<my-project>/<cellosaurus-celllines-ndjson>
    
  • ICGC

    • NOTE: Active site and data dictionary updates from ICGC DCC to ICGC ARGO are in progress.
     fhirizer generate --name icgc --icgc <ICGC_project_name> --has_files
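
The GDC convert and generate steps above follow the same pattern for each entity, so they can be scripted. The sketch below only assembles the argument lists; it uses the flags shown in this README and nothing else. The helper `gdc_commands` and the `GDC_INPUTS` table are hypothetical names introduced for illustration, and the project directory is a placeholder.

```python
import shlex
from pathlib import Path

# Input file stem for each GDC entity name, as used in the commands above.
GDC_INPUTS = {"case": "cases", "file": "files"}


def gdc_commands(project_dir: str, name: str) -> list[list[str]]:
    """Return the convert and generate argv lists for one GDC entity."""
    stem = GDC_INPUTS[name]  # KeyError for entities not shown in this README
    project = Path(project_dir)
    in_path = project / f"{stem}.ndjson"
    key_path = project / f"{stem}_key.ndjson"
    convert = ["fhirizer", "convert", "--name", name,
               "--in_path", str(in_path), "--out_path", str(key_path),
               "--verbose", "True"]
    generate = ["fhirizer", "generate", "--name", name,
                "--out_dir", str(project / "META"),
                "--entity_path", str(key_path)]
    return [convert, generate]


if __name__ == "__main__":
    for entity in ("case", "file"):
        for argv in gdc_commands("./projects/my-project", entity):
            # Print only; pass argv to subprocess.run(argv, check=True) to execute.
            print(shlex.join(argv))
```

Keeping the commands as lists rather than strings avoids shell-quoting problems if a project path contains spaces.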
    

Constructing GDC maps: CLI commands

Initialize the starting structure for project, case, or file Maps:

fhirizer project_init 
# to update the mappings, run the associated labels script, e.g. ./labels/project.py

fhirizer case_init 
fhirizer file_init 

Testing

pytest --cov

fhirizer structure:

Data directories included in package data:

  • resources: data resources generated or used in the mappings
  • mapping: JSON data maps produced by fhirizer's pydantic schema maps

fhirizer/
|-- fhirizer/
|   |-- __init__.py
|   |-- labels/
|   |   |-- __init__.py
|   |   |-- files.py
|   |   |-- case.py
|   |   └── project.py
|   |   
|   |-- schema.py
|   |-- entity2fhir.py
|   |-- mapping.py
|   |-- utils.py
|   └── cli.py
|   
|-- mapping/
|   |-- project.json
|   |-- case.json
|   └── file.json
|  
|-- resources/
|   |-- gdc_resources/
|   |   |-- content_annotations/
|   |   |-- data_dictionary/
|   |   └── fields/
|   └── fhir_resources/
| 
|-- tests/
|   |-- __init__.py
|   |-- unit/
|   |   |-- __init__.py
|   |   └── test_mapping.py
|   |-- integration/
|   |   |-- __init__.py
|   |   |-- test_generate.py
|   |   └── test_convert.py
|   └── fixtures/
| 
|-- projects/
|   |-- GDC/
|   |   └── TCGA-STUDY/
|   |       |-- cases.ndjson
|   |       |-- files.ndjson
|   |       └── META/
|   └── ICGC/
|       └── ICGC-STUDY/
|           |-- data/
|           └── META/
|-- README.md
└── setup.py
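
The META/ directories in the tree above hold the generated ndjson output. One way to sanity-check a run is to count the resources by type. This is a minimal sketch: it assumes each line of every `*.ndjson` file is one JSON object and relies on the standard FHIR `resourceType` field. The output file names are not specified in this README, so the sketch globs the directory rather than naming files. `count_resource_types` is a hypothetical helper, not part of fhirizer.

```python
import json
from collections import Counter
from pathlib import Path


def count_resource_types(meta_dir: str) -> Counter:
    """Count FHIR resources by resourceType across all *.ndjson files in a directory."""
    counts = Counter()
    for path in sorted(Path(meta_dir).glob("*.ndjson")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:  # tolerate blank lines between records
                    continue
                resource = json.loads(line)  # one JSON object per line
                counts[resource.get("resourceType", "unknown")] += 1
    return counts
```

Calling `count_resource_types("./projects/<my-project>/META")` after a generate step returns a `Counter` keyed by resource type. A non-zero `"unknown"` count would indicate records that lack a `resourceType` field.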


Built Distribution

fhirizer-2.0.0-py3-none-any.whl (1.2 MB)

Uploaded Python 3

File hashes

Hashes for fhirizer-2.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1c422c93f6de5071348186c54fa34fea7d7ff77237f38cbc002368c27946ffec
MD5 3bc7f4aa3cc96892db048f80a59dec0c
BLAKE2b-256 9f7fbd94213829520e8f0809d995b36ad41fdf931e41061d6ccdb78295621a53
