This package computes a variety of similarity metrics between concepts present in the UMLS database

Overview

This package computes a variety of similarity metrics between concepts in the UMLS (Unified Medical Language System) database. It is a Python wrapper around the UMLS::Similarity Perl module developed by Bridget McInnes and Ted Pedersen, and gives Python users an accessible interface to it.

Installation

To install PyUMLS_Similarity, run the following command:

pip install pyumls_similarity

Prerequisites

Before using the PyUMLS_Similarity package, ensure that you have the following prerequisites installed and set up:

Strawberry Perl

The package requires Strawberry Perl to run Perl scripts. Download and install it from Strawberry Perl's official website.

MySQL

A local MySQL database instance is required to store and access UMLS data. Download and install MySQL from MySQL's official download page.

UMLS Data

You need to have a local instance of the UMLS installed in MySQL. This involves downloading UMLS data and importing it into your MySQL database. Follow the guidelines provided by the UMLS for obtaining a license and downloading the UMLS data.

UMLS-Interface and UMLS-Similarity Perl Modules

The package depends on the UMLS-Interface and UMLS-Similarity Perl modules. After installing Strawberry Perl, install both modules from CPAN with cpanm:

cpanm UMLS::Interface
cpanm UMLS::Similarity
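Before moving on, it can help to confirm that the command-line tools from the prerequisites are reachable. The following is a minimal sketch using only the Python standard library; the executable names (perl, cpanm, mysql) are assumptions and may differ on your system, and the check says nothing about whether the UMLS data or the Perl modules are installed correctly.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("perl", "cpanm", "mysql")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```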

Usage

Below are some examples of how to use the PyUMLS_Similarity package.

Start by creating an instance of the PyUMLS_Similarity class:

from pyumls_similarity import PyUMLS_Similarity

mysql_info = {
    "username": "your_username",
    "password": "your_password",
    "hostname": "localhost",
    "socket": "your_socket",
    "database": "umls"
}

umls_sim = PyUMLS_Similarity(mysql_info=mysql_info)
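To keep credentials out of source files, the mysql_info dictionary can be built from environment variables instead of literals. This is only a sketch: the UMLS_MYSQL_* variable names and the mysql_info_from_env helper are hypothetical and not part of the package. The dictionary keys are the ones shown above.

```python
import os


def mysql_info_from_env(env=os.environ):
    """Build the mysql_info dictionary from environment variables.

    The UMLS_MYSQL_* names are hypothetical; use whatever names suit you.
    A missing required variable raises KeyError.
    """
    return {
        "username": env["UMLS_MYSQL_USER"],      # required
        "password": env["UMLS_MYSQL_PASSWORD"],  # required
        "socket": env["UMLS_MYSQL_SOCKET"],      # required
        "hostname": env.get("UMLS_MYSQL_HOST", "localhost"),
        "database": env.get("UMLS_MYSQL_DATABASE", "umls"),
    }


# Usage (assumes the variables are set in your shell):
# umls_sim = PyUMLS_Similarity(mysql_info=mysql_info_from_env())
```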

Computing Multiple Similarity Metrics

You can compute similarity metrics between UMLS concepts as shown below:

cui_pairs = [
    ('C0018563', 'C0037303'),
    ('C0035078', 'C0035078'),
    ('C0018787', 'C0027061'),
]

# Compute similarity using specific measures
measures = ['lch', 'wup']
similarity_df = umls_sim.similarity(cui_pairs, measures)

An example output would look something like this:

   Term 1         Term 2          CUI 1     CUI 2     lch    wup
0  hand           skull           C0018563  C0037303  0.500  0.700
1  Renal failure  Kidney failure  C0035078  C0035078  1.000  1.000
2  Heart          Myocardium      C0018787  C0027061  0.823  0.875
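A malformed identifier is easier to catch before the pairs are passed to similarity. A CUI is the letter C followed by seven digits, so a format check needs only a regular expression. The invalid_cuis helper below is a hypothetical sketch, not part of the package, and it checks the format only, not whether a CUI exists in your UMLS installation.

```python
import re

# A CUI is the letter "C" followed by seven digits, e.g. "C0018563".
CUI_PATTERN = re.compile(r"C\d{7}")


def invalid_cuis(cui_pairs):
    """Return the sorted, de-duplicated identifiers that are not well-formed CUIs."""
    bad = set()
    for pair in cui_pairs:
        for cui in pair:
            if not CUI_PATTERN.fullmatch(cui):
                bad.add(cui)
    return sorted(bad)


pairs = [("C0018563", "C0037303"), ("C0035078", "C0035078")]
print(invalid_cuis(pairs))
```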

Finding Shortest Path

To find the shortest path between concepts:

shortest_path_df = umls_sim.find_shortest_path(cui_pairs)

Finding Least Common Subsumer

To find the least common subsumer (LCS) of concepts:

lcs_df = umls_sim.find_least_common_subsumer(cui_pairs)
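To make the terms concrete, here is a toy illustration of a shortest path, a least common subsumer, and the Wu-Palmer measure (the measure conventionally abbreviated wup) on a small hand-made is-a tree. It is not how the package or the Perl module computes them: the node names are invented, each node has a single parent (UMLS concepts can have several), and depth is counted in nodes so the root has depth 1.

```python
# Toy is-a hierarchy (child -> parent). The node names are made up for
# illustration and are not real UMLS concepts.
PARENT = {
    "body_part": "entity",
    "limb": "body_part",
    "hand": "limb",
    "foot": "limb",
    "head": "body_part",
    "skull": "head",
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, both inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def least_common_subsumer(a, b):
    """Return the deepest node that subsumes (or equals) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is the deepest
        if node in ancestors_of_a:
            return node
    raise ValueError("no common ancestor")


def shortest_path(a, b):
    """Return the path from a to b; in a tree it always passes through the LCS."""
    lcs = least_common_subsumer(a, b)
    up = path_to_root(a)
    up = up[: up.index(lcs) + 1]    # a ... lcs
    down = path_to_root(b)
    down = down[: down.index(lcs)]  # b ... child of lcs
    return up + down[::-1]


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = least_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(least_common_subsumer("hand", "skull"))  # body_part
print(shortest_path("hand", "skull"))
print(wup("hand", "skull"))                    # 0.5
```

In this toy tree, hand and foot share the closer ancestor limb, so they score higher (0.75) than hand and skull (0.5).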

Concurrency

PyUMLS_Similarity also supports running tasks concurrently for efficiency:

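# Each 'arguments' value is a tuple; a single argument needs a trailing
# comma, because (cui_pairs) without it is just cui_pairs.
tasks = [
    {'function': 'similarity', 'arguments': (cui_pairs, measures)},
    {'function': 'shortest_path', 'arguments': (cui_pairs,)},
    {'function': 'lcs', 'arguments': (cui_pairs,)}
]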

results = umls_sim.run_concurrently(tasks)
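For readers curious how a task list of this shape can be dispatched, the sketch below uses a thread pool from the standard library. It is an illustration only and makes no claim about how run_concurrently is implemented: run_tasks, the registry argument, and the stand-in functions are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(tasks, registry, max_workers=4):
    """Run every task concurrently and return the results in task order.

    `registry` maps each task's 'function' name to a callable. Each task's
    'arguments' must be a tuple, since it is unpacked with *.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(registry[task["function"]], *task["arguments"])
            for task in tasks
        ]
        return [future.result() for future in futures]


# Stand-in functions so the sketch runs without Perl, MySQL, or UMLS.
def count_pairs(pairs):
    return len(pairs)


def label_pairs(pairs, measures):
    return [(a, b, m) for (a, b) in pairs for m in measures]


demo_pairs = [("C0018563", "C0037303")]
demo_tasks = [
    {"function": "label", "arguments": (demo_pairs, ["lch", "wup"])},
    {"function": "count", "arguments": (demo_pairs,)},
]
print(run_tasks(demo_tasks, {"label": label_pairs, "count": count_pairs}))
```

Threads suit this kind of workload when each task spends most of its time waiting on an external process or a database rather than computing in Python.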

Acknowledgements

This package is based on the Perl module developed by Bridget McInnes and Ted Pedersen.
