VICC normalization routine for therapies

Therapy Normalization

Services and guidelines for normalizing drug (and non-drug therapy) terms

Developer instructions

The following sections include instructions specifically for developers.

Installation

For a development install, we recommend using Pipenv. See the Pipenv docs for directions on installing Pipenv in your compute environment.

Once Pipenv is installed, run the following from the project root directory:

pipenv sync

Deploying DynamoDB Locally

We use Amazon DynamoDB for our database. To deploy locally, follow these instructions.

Init coding style tests

Code style is managed by flake8 and checked prior to commit.

We use pre-commit to run conformance tests.

The pre-commit hooks perform the following checks:

  • Check code style
  • Check for added large files
  • Detect AWS Credentials
  • Detect Private Key

Before your first commit, run:

pre-commit install

Running unit tests

Run the unit tests with pytest:

pipenv run pytest

Updating the therapy normalization database

Before you use the CLI to update the database, run the following in a separate terminal to start DynamoDB on port 8000:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

To change the port, add -port followed by the port number to the command.
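Before running the CLI, it can help to confirm that something is actually listening on the DynamoDB port. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical and not part of this package. It only checks that a TCP connection succeeds, not that the listener is DynamoDB.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it does not verify that the listener is DynamoDB,
    only that some process accepts connections on that port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, port_is_open("localhost", 8000) should return False until the java command above has been started.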

Setting Environment Variables

RxNorm requires a UMLS license, which you can register for here. You must set the RXNORM_API_KEY environment variable to your API key. The key can be found in the UTS 'My Profile' area after signing in.

export RXNORM_API_KEY={rxnorm_api_key}
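A script that wraps the update step may want to fail early when the key is missing. This is a minimal sketch, assuming only that the variable is named RXNORM_API_KEY as in the export line above; the function get_rxnorm_api_key is hypothetical and does not describe how the package itself reads the key.

```python
import os
from typing import Mapping


def get_rxnorm_api_key(environ: Mapping[str, str] = os.environ) -> str:
    """Return the RxNorm API key, or raise if it is unset or blank.

    Hypothetical helper for illustration; the package may read the
    variable differently.
    """
    key = environ.get("RXNORM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RXNORM_API_KEY is not set; export it before updating RxNorm."
        )
    return key
```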

Update source(s)

The sources we currently use are: ChEMBL, NCIt, DrugBank (CC0 data only), RxNorm, ChemIDplus, Wikidata, and HemOnc.org.

To update source(s), simply set --normalizer to the source(s) you wish to update separated by spaces. For example, the following command updates ChEMBL and Wikidata:

python3 -m therapy.cli --normalizer="chembl wikidata"

You can update all sources at once with the --update_all flag:

python3 -m therapy.cli --update_all
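If updates are driven from a script rather than typed by hand, the two invocations above can be assembled programmatically. The sketch below uses only the --normalizer and --update_all flags shown in this section; the function build_update_command is a hypothetical name, and passing the space-separated source list as a single argument mirrors the quoted shell form.

```python
import sys
from typing import List, Optional, Sequence


def build_update_command(sources: Optional[Sequence[str]] = None) -> List[str]:
    """Build the argument list for a source update.

    With no sources, fall back to --update_all; otherwise pass the
    space-separated names to --normalizer as one argument.
    """
    cmd = [sys.executable, "-m", "therapy.cli"]
    if sources:
        cmd.append("--normalizer=" + " ".join(sources))
    else:
        cmd.append("--update_all")
    return cmd


# Usage (requires the package and a running database):
#   import subprocess
#   subprocess.run(build_update_command(["chembl", "wikidata"]), check=True)
```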

The data/ subdirectory within the application should include all source data. The normalizer is capable of acquiring most of these files automatically; the exception is the HemOnc.org data, which must be manually downloaded from the Harvard Dataverse and placed within the data/hemonc subdirectory. Files for all sources should follow the naming convention demonstrated below (with version numbers/dates changed where applicable).

therapy/data
├── chembl
│   └── chembl_27.db
├── chemidplus
│   └── chemidplus_20200327.xml
├── drugbank
│   └── drugbank_5.1.8.csv
├── hemonc
│   ├── hemonc_concepts_20210225.csv
│   ├── hemonc_rels_20210225.csv
│   └── hemonc_synonyms_20210225.csv
├── ncit
│   └── ncit_20.09d.owl
├── rxnorm
│   ├── drug_forms.yaml
│   └── rxnorm_20210104.RRF
└── wikidata
    └── wikidata_20210425.json
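The naming convention in the tree above can be checked before an update, which is most useful for the manually downloaded HemOnc.org files. The following is a sketch only: the regular expressions are inferred from the example filenames and are assumptions, not the patterns the normalizer itself uses, and unexpected_files is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import Dict, List

# Patterns inferred from the example tree; treat them as assumptions.
EXPECTED: Dict[str, str] = {
    "chembl": r"chembl_\d+\.db",
    "chemidplus": r"chemidplus_\d{8}\.xml",
    "drugbank": r"drugbank_\d+(\.\d+)*\.csv",
    "hemonc": r"hemonc_(concepts|rels|synonyms)_\d{8}\.csv",
    "ncit": r"ncit_\d+\.\d+[a-z]\.owl",
    "rxnorm": r"drug_forms\.yaml|rxnorm_\d{8}\.RRF",
    "wikidata": r"wikidata_\d{8}\.json",
}


def unexpected_files(data_dir: Path) -> List[str]:
    """Return 'source/filename' for each file that breaks the convention."""
    problems: List[str] = []
    for source_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        pattern = EXPECTED.get(source_dir.name)
        for f in sorted(p for p in source_dir.iterdir() if p.is_file()):
            # Files in an unknown source directory are also reported.
            if pattern is None or not re.fullmatch(pattern, f.name):
                problems.append(f"{source_dir.name}/{f.name}")
    return problems
```

Calling unexpected_files(Path("therapy/data")) returns an empty list when every file matches its inferred pattern.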

Create Merged Concept Groups

The /normalize endpoint relies on merged concept groups. The --update_merged flag generates these groups:

python3 -m therapy.cli --update_merged

Specifying the database URL endpoint

The default URL endpoint is http://localhost:8000. There are two different ways to specify the database URL endpoint.

The first way is to set the --db_url flag to the URL endpoint.

python3 -m therapy.cli --update_all --db_url="http://localhost:8001"

The second way is to set the THERAPY_NORM_DB_URL environment variable to the URL endpoint.

export THERAPY_NORM_DB_URL="http://localhost:8001"
python3 -m therapy.cli --update_all
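The two mechanisms and the default can be thought of as a simple lookup chain. The sketch below is illustrative only: this section does not state which setting wins when both are given, so the order used here (flag first, then environment variable, then default) is an assumption, and resolve_db_url is a hypothetical function rather than the package's own.

```python
import os
from typing import Mapping, Optional

DEFAULT_DB_URL = "http://localhost:8000"


def resolve_db_url(
    flag_value: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Pick the database URL endpoint.

    Assumed precedence: --db_url value, then THERAPY_NORM_DB_URL,
    then the documented default.
    """
    if flag_value:
        return flag_value
    return environ.get("THERAPY_NORM_DB_URL") or DEFAULT_DB_URL
```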

Starting the therapy normalization service

From the project root, run the following:

uvicorn therapy.main:app --reload

Next, view the OpenAPI docs on your local machine:

http://127.0.0.1:8000/therapy

Download files

Source distribution: thera-py-0.2.20.tar.gz (38.4 kB)

Built distribution: thera_py-0.2.20-py3-none-any.whl (48.6 kB, Python 3)
