Skip to main content

VICC normalization routine for therapies

Project description

Therapy Normalization

Services and guidelines for normalizing drug (and non-drug therapy) terms

Developer instructions

The following sections include instructions specifically for developers.

Installation

For a development install, we recommend using Pipenv. See the pipenv docs for direction on installing pipenv in your compute environment.

Once installed, from the project root dir, just run:

pipenv sync

Deploying DynamoDB Locally

We use Amazon DynamoDB for our database. To deploy locally, follow these instructions.

Init coding style tests

Code style is managed by flake8 and checked prior to commit.

We use pre-commit to run conformance tests.

This ensures:

  • Check code style
  • Check for added large files
  • Detect AWS Credentials
  • Detect Private Key

Before first commit run:

pre-commit install

Running unit tests

Running unit tests is as easy as pytest.

pipenv run pytest

Updating the therapy normalization database

Before you use the CLI to update the database, run the following in a separate terminal to start DynamoDB on port 8000:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

To change the port, simply add -port value.

Setting Environment Variables

RxNorm requires a UMLS license, which you can register for one here. You must set the RxNORM_API_KEY environment variable to your API key. This can be found in the UTS 'My Profile' area after singing in.

export RXNORM_API_KEY={rxnorm_api_key}

Update source(s)

The Therapy Normalizer currently aggregates therapy data from ChEMBL, NCIt, RxNorm, DrugBank (currently using CC0 data only), ChemIDPlus, Wikidata, and HemOnc.org.

To update source(s), simply set --normalizer to the source(s) you wish to update separated by spaces. For example, the following command updates ChEMBL and Wikidata:

python3 -m therapy.cli --normalizer="chembl wikidata"

You can update all sources at once with the --update_all flag:

python3 -m therapy.cli --update_all

The data/ subdirectory within the application should include all source data. The normalizer is capable of acquiring most of these files automatically; the exception is the HemOnc.org data, which must be manually downloaded from the Harvard Dataverse and placed within the data/hemonc subdirectory. Files for all sources should follow the naming convention demonstrated below (with version numbers/dates changed where applicable).

therapy/data
├── chembl
│   └── chembl_27.db
├── chemidplus
│   └── chemidplus_20200327.xml
├── drugbank
│   └── drugbank_5.1.8.csv
├── hemonc
│   ├── hemonc_concepts_20210225.csv
│   ├── hemonc_rels_20210225.csv
│   └── hemonc_synonyms_20210225.csv
├── ncit
│   └── ncit_20.09d.owl
├── rxnorm
│   ├── drug_forms.yaml
│   └── rxnorm_20210104.RRF
└── wikidata
    └── wikidata_20210425.json

Updates to the HemOnc source depend on the Disease Normalizer service. If the Disease Normalizer database appears to be empty or incomplete, updates to HemOnc will also trigger a refresh of the Disease Normalizer database. See its README for additional data requirements.

Create Merged Concept Groups

The /normalize endpoint relies on merged concept groups. The --update_merged flag generates these groups:

python3 -m therapy.cli --update_merged

Specifying the database URL endpoint

The default URL endpoint is http://localhost:8000. There are two different ways to specify the database URL endpoint.

The first way is to set the --db_url flag to the URL endpoint.

python3 -m therapy.cli --update_all --db_url="http://localhost:8001"

The second way is to set the environment variable THERAPY_NORM_DB_URL to the URL endpoint.

export THERAPY_NORM_DB_URL="http://localhost:8001"
python3 -m therapy.cli --update_all

Starting the therapy normalization service

From the project root, run the following:

uvicorn therapy.main:app --reload

Next, view the OpenAPI docs on your local machine:

http://127.0.0.1:8000/therapy

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for thera-py, version 0.2.26
Filename, size File type Python version Upload date Hashes
Filename, size thera_py-0.2.26-py3-none-any.whl (45.8 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size thera-py-0.2.26.tar.gz (36.6 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page