A tool to extract BioConcept entities (e.g., genes, diseases, chemicals, and species) from PubTator3 and generate a co-mention network for interactive use.

NetMedEx

NetMedEx is a Python-based tool that extracts BioConcept entities (e.g., genes, diseases, chemicals, and species) from PubTator files generated by PubTator3. It calculates the frequency of BioConcept pairs (e.g., gene-gene, gene-chemical, chemical-disease) based on co-mentions in publications and generates a co-mention interaction network. These networks can be viewed in a browser or imported into Cytoscape for advanced visualization and analysis.
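
To make the idea of co-mention frequency concrete, here is a minimal sketch in plain Python. It is not NetMedEx's implementation: it assumes co-mentions are counted once per document, and the document IDs and entity lists are made-up toy data.

```python
from collections import Counter
from itertools import combinations


def count_comentions(docs):
    """Count how many documents mention each unordered pair of entities."""
    pair_counts = Counter()
    for entities in docs.values():
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for a, b in combinations(sorted(set(entities)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical annotations: document ID -> entities mentioned in it.
docs = {
    "doc1": ["Metformin", "PON1", "COVID-19"],
    "doc2": ["Metformin", "COVID-19"],
    "doc3": ["PON1", "COVID-19", "COVID-19"],
}

counts = count_comentions(docs)
for pair, n in counts.most_common():
    print(pair, n)
```

Each pair becomes a candidate edge in the network, and its count is one possible edge weight.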

Getting Started

NetMedEx offers four ways for users to interact with the tool:

  1. Web Application (via Docker)
  2. Web Application (Local)
  3. Command-Line Interface (CLI)
  4. Python API

For additional details, refer to the Documentation.

Web Application (via Docker)

If Docker is installed on your machine, run the following command to launch the web application, then open localhost:8050 in your browser:

docker run -p 8050:8050 --rm lsbnb/netmedex

Installation

Install NetMedEx from PyPI to use the web application locally or access the CLI:

pip install netmedex

We recommend using Python version >= 3.11 for NetMedEx.

Web Application (Local)

After installing NetMedEx, run the following command and open localhost:8050 in your browser:

netmedex run

The sidebar parameters are detailed in the Available Commands section and in the Documentation.

Command-Line Interface (CLI)

To generate a network, run netmedex search first to retrieve relevant articles and then run netmedex network to generate the network.
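
The two-step workflow can also be scripted. The sketch below only assembles the two commands from flags documented in this README (-q, -o, -i, -w). The helper names build_pipeline and run_pipeline and the output file names are illustrative, not part of NetMedEx.

```python
import shlex
import subprocess


def build_pipeline(query, pubtator_path, html_path, cut_weight=2):
    """Return the search command followed by the network command."""
    search_cmd = ["netmedex", "search", "-q", query, "-o", pubtator_path]
    network_cmd = [
        "netmedex", "network",
        "-i", pubtator_path,   # output of the search step
        "-o", html_path,
        "-w", str(cut_weight),
    ]
    return [search_cmd, network_cmd]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_pipeline(
    '"N-dimethylnitrosamine" AND "Metformin"',
    "output.pubtator",   # illustrative file name
    "output.html",       # illustrative file name
)
for cmd in commands:
    print(shlex.join(cmd))

# Uncomment to execute; requires netmedex to be installed and network access.
# run_pipeline(commands)
```

Passing the arguments as a list avoids shell quoting problems with queries that contain double quotes.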

Search PubMed Articles

Use the CLI to search for articles containing specific biological concepts via the PubTator3 API:

# Query with keywords and sort articles by relevance (default: recency)
netmedex search -q '"N-dimethylnitrosamine" AND "Metformin"' [-o OUTPUT_FILEPATH] --sort score

# Query with article PMIDs
netmedex search -p 34895069,35883435,34205807 [-o OUTPUT_FILEPATH]

# Query with article PMIDs (from file)
netmedex search -f examples/pmids.txt [-o OUTPUT_FILEPATH]

# Query with PubTator3 Entity ID and limit the number of articles to 100
netmedex search -q '"@DISEASE_COVID_19" AND "@GENE_PON1"' [-o OUTPUT_FILEPATH] --max_articles 100

Note: Use double quotes for keywords containing spaces, and use logical operators (e.g., AND/OR) to combine keywords.
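
When queries are assembled in a script, a small helper keeps the quoting consistent. This is a sketch only: build_query is a hypothetical helper, not a NetMedEx function, and it quotes every keyword, as the examples above do.

```python
def build_query(keywords, operator="AND"):
    """Wrap each keyword in double quotes and join with a logical operator."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    quoted = ['"{}"'.format(keyword.strip()) for keyword in keywords]
    return " {} ".format(operator).join(quoted)


query = build_query(["N-dimethylnitrosamine", "Metformin"])
entity_query = build_query(["@DISEASE_COVID_19", "@GENE_PON1"])
print(query)
print(entity_query)
```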

Available options are detailed in the Search Command section.

Generate Co-Mention Networks

The PubTator file output by netmedex search is used to generate the network.

# Use default parameters and set edge weight cutoff to 1
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 1

# Keep MeSH terms and discard non-MeSH terms
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 1 --node_type mesh

# Keep confident relations between entities
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 1 --node_type relation

# Save the result in XGMML format for Cytoscape
netmedex network -i examples/pmids_output.pubtator -o pmids_output.xgmml -w 1 -f xgmml

# Use normalized pointwise mutual information (NPMI) to weight edges
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 5 --weighting_method npmi

Available options are detailed in the Network Command section.
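
For readers unfamiliar with NPMI, the sketch below computes the textbook definition from document counts: the pointwise mutual information of two entities divided by the negative log of their joint probability. It illustrates the measure only; how NetMedEx estimates the probabilities or scales the resulting edge weights is not specified here, and the function name and arguments are illustrative.

```python
import math


def npmi(n_xy, n_x, n_y, n_docs):
    """Normalized pointwise mutual information from document counts.

    n_xy: documents mentioning both entities
    n_x, n_y: documents mentioning each entity
    n_docs: total number of documents
    """
    if n_xy == 0:
        return -1.0  # the pair never co-occurs
    p_xy = n_xy / n_docs
    if p_xy == 1.0:
        # Degenerate 0/0 case: both entities appear in every document.
        # This sketch treats it as perfect co-occurrence.
        return 1.0
    p_x = n_x / n_docs
    p_y = n_y / n_docs
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


# Two entities that always appear together score close to 1.
together = npmi(n_xy=10, n_x=10, n_y=10, n_docs=100)
# Two statistically independent entities score close to 0.
independent = npmi(n_xy=25, n_x=50, n_y=50, n_docs=100)
print(together, independent)
```

Unlike raw frequency, this measure does not favor entities simply because they are mentioned in many articles.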

View the Network

  • HTML Output: Open in a browser to view the network.
  • XGMML Output: Import into Cytoscape for further analysis.

Refer to the Documentation for more details.

Available Commands

General

usage: netmedex [-h] {search,network,run} ...

positional arguments:
  {search,network,run}
    search              Search PubMed articles and obtain annotations
    network             Build a network from annotations
    run                 Run NetMedEx app

options:
  -h, --help            Show this help message and exit

Search Command

usage: netmedex search [-h] [-q QUERY] [-o OUTPUT] [-p PMIDS] [-f PMID_FILE] [-s {score,date}] [--max_articles MAX_ARTICLES] [--full_text]
                       [--use_mesh] [--debug]

options:
  -h, --help            show this help message and exit
  -q QUERY, --query QUERY
                        Query string
  -o OUTPUT, --output OUTPUT
                        Output path (default: [CURRENT_DIR].pubtator)
  -p PMIDS, --pmids PMIDS
                        PMIDs for the articles (comma-separated)
  -f PMID_FILE, --pmid_file PMID_FILE
                        Filepath to load PMIDs (one per line)
  -s {score,date}, --sort {score,date}
                        Sort articles in descending order by (default: date)
  --max_articles MAX_ARTICLES
                        Maximal articles to request from the searching result (default: 1000)
  --full_text           Collect full-text annotations if available
  --use_mesh            Use MeSH vocabulary instead of the most commonly used original text in articles
  --debug               Print debug information
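
The -f option expects a file with one PMID per line. A minimal sketch for producing such a file from a Python list follows. The helper write_pmid_file and the temporary file location are illustrative, and the duplicate removal and digit check are this sketch's own choices, not documented NetMedEx behavior.

```python
import tempfile
from pathlib import Path


def write_pmid_file(pmids, path):
    """Write PMIDs one per line, dropping duplicates while keeping order."""
    unique = list(dict.fromkeys(str(p).strip() for p in pmids))
    for pmid in unique:
        if not pmid.isdigit():
            raise ValueError("not a numeric PMID: {!r}".format(pmid))
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique


pmid_path = Path(tempfile.gettempdir()) / "pmids.txt"
pmids = write_pmid_file([34895069, "35883435", 34205807, 34895069], pmid_path)

# Command that would consume the file (printed here, not executed).
search_cmd = ["netmedex", "search", "-f", str(pmid_path)]
print(" ".join(search_cmd))
```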

Network Command

usage: netmedex network [-h] [-i INPUT] [-o OUTPUT] [-w CUT_WEIGHT] [-f {xgmml,html,json}] [--node_type {all,mesh,relation}]
                        [--weighting_method {freq,npmi}] [--pmid_weight PMID_WEIGHT] [--debug] [--community] [--max_edges MAX_EDGES]

options:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Path to the pubtator file
  -o OUTPUT, --output OUTPUT
                        Output path (default: [INPUT_DIR].[FORMAT_EXT])
  -w CUT_WEIGHT, --cut_weight CUT_WEIGHT
                        Discard the edges with weight smaller than the specified value (default: 2)
  -f {xgmml,html,json,pickle}, --format {xgmml,html,json,pickle}
                        Output format (default: html)
  --node_type {all,mesh,relation}
                        Keep specific types of nodes (default: all)
  --weighting_method {freq,npmi}
                        Weighting method for network edge (default: freq)
  --pmid_weight PMID_WEIGHT
                        CSV file for the weight of the edge from a PMID (default: 1)
  --debug               Print debug information
  --community           Divide nodes into distinct communities by the Louvain method
  --max_edges MAX_EDGES
                        Maximum number of edges to display (default: 0, no limit)
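
The combined effect of -w and --max_edges can be pictured with the sketch below. The cutoff follows the help text (edges with weight smaller than the cutoff are discarded). Keeping the heaviest edges when --max_edges applies is an assumption of this sketch, not documented behavior, and filter_edges is a hypothetical helper.

```python
def filter_edges(edges, cut_weight=2, max_edges=0):
    """Filter (source, target, weight) tuples by weight and count."""
    # Discard edges whose weight is smaller than the cutoff.
    kept = [edge for edge in edges if edge[2] >= cut_weight]
    # Assumption: when a limit applies, the heaviest edges are kept.
    kept.sort(key=lambda edge: edge[2], reverse=True)
    if max_edges > 0:  # 0 means no limit
        kept = kept[:max_edges]
    return kept


# Toy edge list with made-up node names and weights.
edges = [("A", "B", 5), ("A", "C", 1), ("B", "C", 2), ("C", "D", 3)]

default_filtered = filter_edges(edges)
limited = filter_edges(edges, cut_weight=2, max_edges=2)
print(default_filtered)
print(limited)
```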

Package API

In addition to the web interface and CLI, NetMedEx can be used programmatically as a Python library. This allows for more flexible integration into custom pipelines and analysis workflows.

Example usage is available in notebooks/netmedex_usage.ipynb.
