Python tools for interacting with the UMLS

Project description

UMLS tools

version: 0.3.3b1

The umls-tools package is a toolkit to build the UMLS data into a relational database. It also provides an SQLAlchemy object relational mapper and an API for using Metamap. In addition, there are scripts to extract the relationships and build them into a Neo4j graph database.

There is online documentation for umls-tools.

Please note that the code in this package is intended for research use only and not meant for any clinical use.

Installation instructions

At present, umls-tools is undergoing development and no packages exist yet on PyPI. It is therefore recommended that you install in one of the two ways listed below.

Installation using conda

I maintain a conda package in my personal conda channel. To install from this please run:

conda install -c cfin -c bioconda -c conda-forge umls-tools

There are currently builds for Python v3.8, v3.9 and v3.10 for Linux-64 and Mac-osx. Please keep in mind that all development is carried out on Linux-64 and Python v3.8/v3.9. I do not own a Mac, so I can't test on one. The conda build does run some import tests, but that is all.

Installation using pip

You can install using pip from the root of the cloned repository. First, clone the repository and cd into its root:

git clone git@gitlab.com:cfinan/umls-tools.git
cd umls-tools

Install the dependencies:

python -m pip install --upgrade -r requirements.txt

Then install using pip:

python -m pip install .

Or, for an editable (developer) install, run the command below from the root of the repository. The difference is that you can just do a git pull to update, or switch branches, without re-installing:

python -m pip install -e .

Conda dependencies

There are also conda YAML environment files in ./resources/conda/envs. They list the same prerequisites as requirements.txt, but as conda packages. I use these to install all the requirements via conda and then install the package as an editable pip install.

If you find these useful then please use them. There are conda environments for Python v3.8, v3.9 and v3.10.
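
Whichever install route you use, a quick check that the package is visible to your Python interpreter can save some confusion later. The snippet below is only a sketch. The module name umls_tools is taken from this document, but the distribution names it tries (cfin-umls-tools and umls-tools) are assumptions based on the file and conda package names and may not match your install.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name="umls_tools",
                     dist_names=("cfin-umls-tools", "umls-tools")):
    """Return (importable, version) for the installed package.

    The default distribution names are assumptions, adjust them if needed.
    """
    # find_spec returns None when a top-level module cannot be found.
    importable = importlib.util.find_spec(module_name) is not None
    version = None
    for name in dist_names:
        try:
            version = metadata.version(name)
            break
        except metadata.PackageNotFoundError:
            continue
    return importable, version


if __name__ == "__main__":
    print(describe_install())
```

If the first value is False, the interpreter you are running is probably not the one from the environment you installed into.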

Next steps...

You might want to set up a database connection config file if you are using any RDBMS other than SQLite.

You will also want to install a copy of the UMLS database.

Although the umls_tools.parse module is deprecated, it does require the GeniaTagger to be installed. The path to the binary should be set in an environment variable called GENIATAGGER in your ~/.bashrc. If you do not plan to use the umls_tools.parse module then this is optional.
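
As a rough illustration of the GENIATAGGER requirement, the sketch below checks that the variable is set and points at an executable file. The function name find_geniatagger is hypothetical and is not part of umls-tools; the package may resolve the variable differently.

```python
import os
from pathlib import Path


def find_geniatagger(env=None):
    """Return the GeniaTagger binary path from GENIATAGGER, or None.

    Hypothetical helper, not part of the umls-tools API.
    """
    env = os.environ if env is None else env
    value = env.get("GENIATAGGER")
    if not value:
        return None
    path = Path(value).expanduser()
    # Only accept an existing file that the current user can execute.
    if path.is_file() and os.access(path, os.X_OK):
        return path
    return None


if __name__ == "__main__":
    found = find_geniatagger()
    print(found if found else "GENIATAGGER is unset or not an executable file")
```

Remember that a variable exported in ~/.bashrc is only visible in shells started after the edit, so open a new terminal before running the check.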

If you plan to use Metamap, you will also need to install it locally. You will need to log in to the NLM to download it.

There is also an experimental Neo4j build script you can try but read below first.

In addition to the Python command-line scripts that are available when the package is installed, there are also some bash administrative scripts located in ./resources/bin. Please note that these are not installed by either install method (clone & pip, or conda). If you installed via conda, you will have to clone the repository to get them. With either install method you will need to add the ./resources/bin directory to your PATH.

These scripts will require two bash libraries to be in your PATH.

  1. shflags <https://github.com/kward/shflags> - used to manage bash command-line arguments.
  2. bash-helpers <https://gitlab.com/cfinan/bash-helpers> - wraps some handy bash functions.

For more information on what is available see the bash script documentation.
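
Because bash libraries are sourced rather than executed, they may not have the executable bit set, so a plain file search of PATH is a more reliable check than looking for a command. The sketch below does that. The helper find_on_path is hypothetical, and the file name shflags is an assumption about how the library is named on your system; the file names provided by bash-helpers are not listed here, so they are not checked.

```python
import os
from pathlib import Path


def find_on_path(filename, path=None):
    """Return the first file called filename in a PATH directory, or None.

    Hypothetical helper: it ignores the executable bit on purpose,
    since sourced bash libraries do not need it.
    """
    path = os.environ.get("PATH", "") if path is None else path
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "shflags" is the assumed file name of the shflags library.
    print(find_on_path("shflags"))
```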

Change log

version 0.3.0a0

  • API - Add a generalisable index module (umls_tools.admin.index) for creating index tables in the UMLS and other databases. This also offers some basic index search options.
  • API - Added an umls_tools.orm_mixin module, to generalise index table creation.
  • API - Updated the ORM to add index tables to the UMLS schema.
  • API - Deprecated the umls_tools.parsers module.
  • SCRIPTS - Added a UMLS database index script to create index tables from the MRCONSO.STR fields.

version 0.3.1a0

  • API - Updated to use SQLAlchemy 2. This will cause some warnings when the ORM module is loaded. I am currently investigating this.

Source Distribution

cfin_umls_tools-0.3.3b1.tar.gz (112.4 kB)

Uploaded Source

Built Distribution

cfin_umls_tools-0.3.3b1-py3-none-any.whl (126.3 kB)

Uploaded Python 3

File details

Details for the file cfin_umls_tools-0.3.3b1.tar.gz.

File metadata

  • Download URL: cfin_umls_tools-0.3.3b1.tar.gz
  • Upload date:
  • Size: 112.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for cfin_umls_tools-0.3.3b1.tar.gz
Algorithm Hash digest
SHA256 ffca665f96d46166d77a83594a037c7cbc69b64fd9a796e9d438b160940a87f8
MD5 e987325dc7342e165db2fff445c20874
BLAKE2b-256 273e9d76d00995ba28debc4c92cced8e254bc8b551a09a111c3581c89061bb9b

File details

Details for the file cfin_umls_tools-0.3.3b1-py3-none-any.whl.

File hashes

Hashes for cfin_umls_tools-0.3.3b1-py3-none-any.whl
Algorithm Hash digest
SHA256 c2b2661d83499b7a96caaca1aa7cf456481c774f646213fe574c991816f247d3
MD5 884255f0cd74e8ba74ad09546149f8d0
BLAKE2b-256 61725a729883b29fa453aac35114acd5149545af9e2af5991d67845823124655
