
Data slicing tool for reading data from one tranSMART and uploading it to another

Project description


transmart-hyper-dicer is a data slicing tool that reads data from one tranSMART instance and uploads it to another.

⚠️ Note: this is a very preliminary version, still under development. Issues can be reported at https://github.com/thehyve/transmart-hyper-dicer/issues.

Configuration

The connection to the Keycloak identity provider and to tranSMART is configured through the following environment variables:

Variable | Description
--- | ---
TRANSMART_URL | URL of the tranSMART back-end application, e.g. https://transmart.example.com
KEYCLOAK_SERVER_URL | URL of the Keycloak identity provider, e.g. https://keycloak.example.com
KEYCLOAK_REALM | Keycloak realm, e.g. dev
KEYCLOAK_CLIENT_ID | Keycloak client ID, e.g. transmart-client
OFFLINE_TOKEN | An offline token, used as a refresh token in order to communicate with tranSMART
VERIFY_CERT | Either a boolean, in which case it controls whether the server's TLS certificate is verified, or a string, in which case it must be a path to a CA bundle to use. Defaults to True.
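The following is a minimal sketch of how these variables could be read from Python, not the tool's actual implementation. The function names parse_verify_cert and load_config are hypothetical, and the accepted boolean spellings ("true", "1", "yes", and so on) are an assumption. The result for VERIFY_CERT has the same shape as the description above: a boolean, or a string holding a CA bundle path.

```python
import os


def parse_verify_cert(value):
    """Interpret a VERIFY_CERT setting as a boolean or a CA bundle path."""
    if value is None or value.strip() == "":
        return True  # documented default
    cleaned = value.strip()
    lowered = cleaned.lower()
    # The accepted spellings below are an assumption of this sketch.
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    # Anything else is treated as a path to a CA bundle.
    return cleaned


def load_config(environ=os.environ):
    """Collect the connection settings; raises KeyError if a required one is missing."""
    required = [
        "TRANSMART_URL",
        "KEYCLOAK_SERVER_URL",
        "KEYCLOAK_REALM",
        "KEYCLOAK_CLIENT_ID",
        "OFFLINE_TOKEN",
    ]
    config = {name: environ[name] for name in required}
    config["VERIFY_CERT"] = parse_verify_cert(environ.get("VERIFY_CERT"))
    return config
```

Passing a plain dictionary instead of os.environ makes the sketch easy to try without touching the real environment.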

To generate an offline token for the user USERNAME, use the curl command below. The user needs the realm-level role mapping offline_access to obtain the token. Before running the command, replace the words in uppercase with your own values.

curl \
  -d 'client_id=KEYCLOAK_CLIENT_ID' \
  -d 'username=USERNAME' \
  -d 'password=PASSWORD' \
  -d 'grant_type=password' \
  -d 'scope=offline_access' \
  'KEYCLOAK_SERVER_URL/auth/realms/KEYCLOAK_REALM/protocol/openid-connect/token'

The value of the refresh_token field in the response is the offline token.
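The same request can be sketched in Python using only the standard library. This is an illustration of the curl command above, not part of the tool; the function names are hypothetical, and server_url is assumed to include the scheme, as in the example value https://keycloak.example.com.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(server_url, realm, client_id, username, password):
    """Return the token endpoint URL and the form-encoded request body."""
    url = "{}/auth/realms/{}/protocol/openid-connect/token".format(
        server_url.rstrip("/"), urllib.parse.quote(realm, safe="")
    )
    form = {
        "client_id": client_id,
        "username": username,
        "password": password,
        "grant_type": "password",
        "scope": "offline_access",
    }
    return url, urllib.parse.urlencode(form).encode("ascii")


def extract_offline_token(response_body):
    """The offline token is the refresh_token field of the JSON response."""
    return json.loads(response_body)["refresh_token"]


def fetch_offline_token(server_url, realm, client_id, username, password):
    """Send the POST request; needs network access to a real Keycloak server."""
    url, data = build_token_request(server_url, realm, client_id, username, password)
    request = urllib.request.Request(url, data=data)  # data implies POST
    with urllib.request.urlopen(request) as response:
        return extract_offline_token(response.read().decode("utf-8"))
```

Splitting the request construction from the network call keeps the URL and form fields easy to inspect before any credentials are sent.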

All variables can also be specified as key-value pairs in a .env file; they are then set as environment variables automatically when the application starts. Example .env file:

KEYCLOAK_CLIENT_ID=transmart-client
KEYCLOAK_SERVER_URL=https://keycloak.example.com
KEYCLOAK_REALM=dev
OFFLINE_TOKEN=<refresh_token value from the curl response>
TRANSMART_URL=https://transmart.example.com
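To illustrate the key-value format, here is a minimal sketch of how such a file could be turned into environment variables. It is not the loader the tool uses; parse_env_lines and apply_env are hypothetical names, and the choice to leave already-set variables untouched is an assumption of this sketch.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a KEY=value line
        values[key.strip()] = value.strip()
    return values


def apply_env(values, environ=os.environ):
    """Set each variable unless it is already defined (an assumption here)."""
    for key, value in values.items():
        environ.setdefault(key, value)
```

For a real file, the lines can be obtained with open(".env").read().splitlines().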

Installation

The package requires Python 3.6+.

To install transmart-hyper-dicer, do:

pip install transmart-hyper-dicer

Or from source:

git clone https://github.com/thehyve/transmart-hyper-dicer.git
cd transmart-hyper-dicer
pip install .

Run tests (including coverage) with:

python setup.py test

Usage

Read a subset of data from the configured tranSMART instance, based on the constraint specified in an input JSON file, and write the output in transmart-copy format to /path/to/output. The output directory must either be empty or not exist yet (in which case it is created).

The input constraint has to be a valid tranSMART constraint. Example content of <input.json>:

{
  "type": "study_name",
  "studyId": "EHR"
}
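If the input file is produced by a script, a sketch like the following could generate it. The helper names are hypothetical; only the study_name constraint shown above is used, and the study identifier is passed in by the caller.

```python
import json
from pathlib import Path


def render_study_constraint(study_id):
    """Return the JSON text of a study_name constraint for the given study."""
    constraint = {"type": "study_name", "studyId": study_id}
    return json.dumps(constraint, indent=2) + "\n"


def write_study_constraint(path, study_id):
    """Write the constraint to the given file path as UTF-8 JSON."""
    Path(path).write_text(render_study_constraint(study_id), encoding="utf-8")
```

Calling write_study_constraint("input.json", "EHR") reproduces the example file above.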

Run:

transmart-hyper-dicer <input.json> /path/to/output
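When the command is driven from a script, the requirement on the output directory can be checked up front. The sketch below is an assumption about how one might wrap the command, not part of the package; output_dir_is_usable and run_dicer are hypothetical names, and only the two positional arguments documented above are passed.

```python
import subprocess
from pathlib import Path


def output_dir_is_usable(path):
    """True if the path does not exist yet or is an empty directory."""
    target = Path(path)
    if not target.exists():
        return True
    return target.is_dir() and not any(target.iterdir())


def run_dicer(input_json, output_dir):
    """Run the command line tool after checking the output directory."""
    if not output_dir_is_usable(output_dir):
        raise ValueError("Output directory must be empty or not exist: {}".format(output_dir))
    # Requires transmart-hyper-dicer to be installed and configured.
    subprocess.run(["transmart-hyper-dicer", str(input_json), str(output_dir)], check=True)
```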

This generates the directories i2b2metadata and i2b2demodata in the output directory. The generated data can be loaded using transmart-copy:

# Download transmart-copy:
curl -f -L https://repo.thehyve.nl/service/local/repositories/releases/content/org/transmartproject/transmart-copy/17.1-HYVE-6.2/transmart-copy-17.1-HYVE-6.2.jar -o transmart-copy.jar
# Load data
PGUSER=tm_cz PGPASSWORD=tm_cz java -jar transmart-copy.jar -d /path/to/output
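Before loading, it can help to confirm that the export produced both directories named above. A minimal sketch, with the hypothetical helper missing_output_dirs:

```python
from pathlib import Path

# Directory names taken from the description of the generated output.
EXPECTED_DIRS = ("i2b2metadata", "i2b2demodata")


def missing_output_dirs(output_dir):
    """Return the expected directories that are absent from output_dir."""
    base = Path(output_dir)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]
```

An empty list means both directories are present; this says nothing about whether their contents are complete.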

Package management and dependencies

This project uses pip for installing dependencies and package management.

  • Dependencies should be added to setup.py in the install_requires list.

License

Copyright (c) 2019 The Hyve B.V.

The Transmart Hyper Dicer is licensed under the MIT License. See the file LICENSE.

