Automate downloading UMLS data.

UMLS Downloader

Don't worry about UMLS licensing and distribution rules - just use umls_downloader to write code that knows how to download the data and use it automatically.

Installation

$ pip install umls_downloader

Download A Specific Version

import os
from umls_downloader import download_umls

# Get this from https://uts.nlm.nih.gov/uts/edit-profile
api_key = ...

path = download_umls(version="2021AB", api_key=api_key)

# This is where it gets downloaded: ~/.data/umls/2021AB/umls-2021AB-mrconso.zip
expected_path = os.path.join(
    os.path.expanduser("~"), ".data", "umls", "2021AB",
    "umls-2021AB-mrconso.zip",
)
assert expected_path == path.as_posix()

After the first download, the file is cached and does not need to be downloaded again. It gets stored automatically using pystow in the ~/.data/umls directory.
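The download-once behaviour can be pictured with a minimal sketch. This is not umls_downloader's or pystow's actual implementation; `ensure_file` and `fake_fetch` are hypothetical names, and the directory layout under a temporary folder only mirrors the one described above.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_file(path: Path, fetch: Callable[[Path], None]) -> Path:
    """Return ``path``, calling ``fetch`` only if the file is not there yet."""
    if not path.is_file():
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(path)
    return path


calls = []


def fake_fetch(target: Path) -> None:
    """Hypothetical stand-in for a real download; records the call and writes a small file."""
    calls.append(target)
    target.write_bytes(b"placeholder")


with tempfile.TemporaryDirectory() as directory:
    # Assumed layout: <base>/umls/<version>/<file name>
    cached = Path(directory) / "umls" / "2021AB" / "umls-2021AB-mrconso.zip"
    first = ensure_file(cached, fake_fetch)
    second = ensure_file(cached, fake_fetch)  # file exists, so nothing is fetched
    fetch_count = len(calls)
```

The second call returns the same path without invoking the fetch function, which is the effect you see when calling download_umls() repeatedly for the same version.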

Automating Configuration of UMLS Credentials

There are two ways to configure the API key automatically so you don't have to worry about getting it and passing it around in your Python code:

  1. Set UMLS_API_KEY in the environment
  2. Create ~/.config/umls.ini and set an api_key key in the [umls] section.

from umls_downloader import download_umls

# Same path as before
path = download_umls(version="2021AB")
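For the second option, the configuration file can be written by hand or with Python's configparser. The sketch below writes to a temporary directory rather than the real ~/.config/umls.ini so it has no side effects; `write_umls_config` and the placeholder key are hypothetical, and only the [umls] section and api_key key come from the description above.

```python
import configparser
import tempfile
from pathlib import Path


def write_umls_config(config_path: Path, api_key: str) -> None:
    """Write an INI file with an ``api_key`` entry in a ``[umls]`` section."""
    parser = configparser.ConfigParser()
    parser["umls"] = {"api_key": api_key}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    with config_path.open("w") as file:
        parser.write(file)


with tempfile.TemporaryDirectory() as directory:
    # Temporary stand-in for ~/.config/umls.ini
    config_path = Path(directory) / ".config" / "umls.ini"
    write_umls_config(config_path, api_key="my-placeholder-key")

    # Read the file back to confirm the section and key are present
    check = configparser.ConfigParser()
    check.read(config_path)
    stored_key = check["umls"]["api_key"]
```

To use it for real, point `config_path` at `Path.home() / ".config" / "umls.ini"` and pass your own key.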

Download the Latest Version

First, you'll have to install bioversions with pip install bioversions, whose job it is to look up the latest version of many databases. Then, you can modify the previous code slightly by omitting the version keyword argument:

from umls_downloader import download_umls

# Same path as before (as of November 21st, 2021)
path = download_umls()

Download and open the file

The UMLS file is zipped, so it's usually accompanied by the following boilerplate code:

import zipfile

from umls_downloader import download_umls

path = download_umls()
with zipfile.ZipFile(path) as zip_file:
    with zip_file.open("MRCONSO.RRF", mode="r") as file:
        for line in file:
            ...

This exact code is wrapped by open_umls(), which is a context manager, so it can be written more simply as:

from umls_downloader import open_umls

with open_umls() as file:
    for line in file:
        ...

The version and api_key arguments also apply here.
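Inside the loop, each line of MRCONSO.RRF is a pipe-delimited record. The sketch below shows one way to split such lines into named fields. The column names are an assumption taken from the UMLS documentation for MRCONSO.RRF, `parse_mrconso_line` is a hypothetical helper, and the sample record is synthetic. It also assumes lines arrive as bytes, as they do from zipfile; whether open_umls() yields bytes or text is not stated here.

```python
import io
import zipfile

# Column names assumed from the UMLS MRCONSO.RRF documentation
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def parse_mrconso_line(line: bytes) -> dict:
    """Split one pipe-delimited line into a column-name -> value mapping."""
    values = line.decode("utf-8").rstrip("\r\n").split("|")
    # Lines end with a trailing pipe, so zip() drops the final empty field
    return dict(zip(MRCONSO_COLUMNS, values))


# Synthetic one-line record so the sketch runs without a UMLS download
sample = (
    b"C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||"
    b"EXAMPLE|PT|X1|example term|0|N||\n"
)

# Build an in-memory zip archive that stands in for the downloaded file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zip_file:
    zip_file.writestr("MRCONSO.RRF", sample)
buffer.seek(0)

with zipfile.ZipFile(buffer) as zip_file:
    with zip_file.open("MRCONSO.RRF", mode="r") as file:
        rows = [parse_mrconso_line(line) for line in file]
```

With a real download, the same helper could be applied to each line yielded inside the `with open_umls() as file:` block, adjusting the decoding step if the lines are already text.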

Why not an API?

The UMLS provides an API for access to tiny bits of data at a time. There are even two recent (last 5 years) packages, umls-api and connect-umls, that provide wrappers around it. However, API access is generally rate limited, difficult to use in bulk, and slow. For working with UMLS (or any other database, for that matter) in bulk, it's necessary to download full database dumps.

👋 Attribution

⚖️ License

The code in this package is licensed under the MIT License.

🍪 Cookiecutter

This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.
