
OpenMed

OpenMed is a Python toolkit for biomedical and clinical NLP that delivers state-of-the-art models, including large language models (LLMs) adapted for healthcare, competitive with proprietary enterprise solutions. It unifies model discovery, assertion status detection, de-identification pipelines, advanced extraction and reasoning tools, and one-line orchestration for scripts, services, and notebooks, so teams can deploy production-grade healthcare AI without vendor lock-in.

It also bundles configuration management, model loading, post-processing, and output-formatting utilities, making it straightforward to integrate clinical AI into existing scripts, services, and research workflows.

Status: The package is pre-release and the API may change. Feedback and contributions are welcome while the project stabilises.

Features

  • Curated model registry with metadata for the OpenMed Hugging Face collection, including category filters, entity coverage, and confidence guidance.
  • One-line model loading via ModelLoader, with optional pipeline creation, caching, and authenticated access to private models.
  • Advanced NER post-processing (AdvancedNERProcessor) that applies the filtering and grouping techniques proven in the OpenMed demos.
  • Text preprocessing & tokenisation helpers tailored for medical text workflows.
  • Output formatting utilities that convert raw predictions into dict/JSON/HTML/CSV for downstream systems.
  • Logging and validation helpers to keep pipelines observable and inputs safe.

Installation

Requirements

  • Python 3.8 or newer (the package metadata allows 3.6+, but Hugging Face tooling typically requires >=3.8).
  • transformers and a compatible deep learning backend such as PyTorch.
  • An optional HF_TOKEN environment variable if you need to access gated models.

Install from PyPI

pip install openmed transformers
# Install a backend (PyTorch shown here; follow the instructions for your platform):
pip install torch --index-url https://download.pytorch.org/whl/cpu

If you plan to run on GPU, install the CUDA-enabled PyTorch wheels from the official instructions.
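
For example, a CUDA 12.1 build can be installed from the corresponding PyTorch wheel index (the exact index URL depends on your CUDA toolkit version; check the official PyTorch "Get Started" selector):

```shell
# CUDA-enabled PyTorch (cu121 shown; pick the index matching your CUDA toolkit)
pip install torch --index-url https://download.pytorch.org/whl/cu121
```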

Quick start

from openmed.core import ModelLoader
from openmed.processing import format_predictions

loader = ModelLoader()  # uses the default configuration
ner = loader.create_pipeline(
    "disease_detection_superclinical",  # registry key or full model ID
    aggregation_strategy="simple",      # merge sub-token predictions into whole entities
)

text = "Patient diagnosed with acute lymphoblastic leukemia and started on imatinib."
raw_predictions = ner(text)

result = format_predictions(raw_predictions, text, model_name="Disease Detection")
for entity in result.entities:
    print(f"{entity.label:<12} -> {entity.text} (confidence={entity.confidence:.2f})")

Use the convenience helper if you prefer a single call:

from openmed import analyze_text

result = analyze_text(
    "Patient received 75mg clopidogrel for NSTEMI.",
    model_name="pharma_detection_superclinical"
)

for entity in result.entities:
    print(entity)

Command-line usage

Installing the package also installs the openmed console command, which provides quick access to model discovery, text analysis, and configuration management.

# List models from the bundled registry (add --include-remote for Hugging Face)
openmed models list
openmed models list --include-remote

# Analyse inline text or a file with a specific model
openmed analyze --model disease_detection_superclinical --text "Acute leukemia treated with imatinib."

# Inspect or edit the CLI configuration (defaults to ~/.config/openmed/config.toml)
openmed config show
openmed config set device cuda

# Inspect the model's inferred context window
openmed models info disease_detection_superclinical

Provide --config-path /custom/path.toml to work with a different configuration file during automation or testing. Run openmed --help to see all options.
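
As a sketch, a CI job could point the CLI at a project-local configuration file (the path is a placeholder, and the flag is assumed here to be accepted before the subcommand):

```shell
# Use a project-specific config instead of ~/.config/openmed/config.toml
openmed --config-path ./ci/openmed.toml models list
```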

Discovering models

from openmed.core import ModelLoader
from openmed.core.model_registry import list_model_categories, get_models_by_category

loader = ModelLoader()
print(loader.list_available_models()[:5])  # Hugging Face + registry entries

suggestions = loader.get_model_suggestions(
    "Metastatic breast cancer treated with paclitaxel and trastuzumab"
)
for key, info, reason in suggestions:
    print(f"{info.display_name} -> {reason}")

print(list_model_categories())
for info in get_models_by_category("Oncology"):
    print(f"- {info.display_name} ({info.model_id})")

from openmed import get_model_max_length
print(get_model_max_length("disease_detection_superclinical"))

Or use the top-level helper:

from openmed import list_models

print(list_models()[:10])

Advanced NER processing

from openmed.core import ModelLoader
from openmed.processing.advanced_ner import create_advanced_processor

loader = ModelLoader()
# aggregation_strategy=None yields raw token-level predictions for maximum control
ner = loader.create_pipeline("pharma_detection_superclinical", aggregation_strategy=None)

text = "Administered 75mg clopidogrel daily alongside aspirin for secondary stroke prevention."
raw = ner(text)

processor = create_advanced_processor(confidence_threshold=0.65)
entities = processor.process_pipeline_output(text, raw)
summary = processor.create_entity_summary(entities)

for entity in entities:
    print(f"{entity.label}: {entity.text} (score={entity.score:.3f})")

print(summary["by_type"])

Text preprocessing & tokenisation

from openmed.processing import TextProcessor, TokenizationHelper
from openmed.core import ModelLoader

text_processor = TextProcessor(normalize_whitespace=True, lowercase=False)
clean_text = text_processor.clean_text("BP 120/80, HR 88 bpm. Start Metformin 500mg bid.")
print(clean_text)

loader = ModelLoader()
model_data = loader.load_model("anatomy_detection_electramed")
token_helper = TokenizationHelper(model_data["tokenizer"])
encoding = token_helper.tokenize_with_alignment(clean_text)
print(encoding["tokens"][:10])

Formatting outputs

# Reuse `raw_predictions` and `text` from the quick start example
from openmed.processing import format_predictions

formatted = format_predictions(
    raw_predictions,
    text,
    model_name="Disease Detection",
    output_format="json",
    include_confidence=True,
    confidence_threshold=0.5,
)
print(formatted)  # JSON string ready for logging or storage

format_predictions can also return CSV rows or rich HTML snippets for dashboards.

Configuration & logging

from openmed.core import OpenMedConfig, ModelLoader
from openmed.utils import setup_logging

config = OpenMedConfig(
    default_org="OpenMed",
    cache_dir="/tmp/openmed-cache",
    device="cuda",  # "cpu", "cuda", or a specific device index
)
setup_logging(level="INFO")
loader = ModelLoader(config=config)

OpenMedConfig automatically picks up HF_TOKEN from the environment so you can access private or gated models without storing credentials in code.
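
As a sketch, supplying the token through the environment rather than in code might look like this (the token value is a placeholder):

```shell
# Export the Hugging Face token once per shell session;
# OpenMedConfig reads HF_TOKEN automatically at load time
export HF_TOKEN="hf_xxxxxxxxxxxxxxxx"
python your_pipeline.py   # no token handling needed inside the script
```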

Validation utilities

from openmed.utils.validation import validate_input, validate_model_name

text = validate_input(user_supplied_text, max_length=2000)
model = validate_model_name("OpenMed/OpenMed-NER-DiseaseDetect-SuperClinical-434M")

Use these helpers to guard API endpoints or batch pipelines against malformed inputs.

License

OpenMed is released under the Apache-2.0 License.

Citing

If you use OpenMed in your research, please cite:

@misc{panahi2025openmedneropensourcedomainadapted,
      title={OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets},
      author={Maziyar Panahi},
      year={2025},
      eprint={2508.01630},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.01630},
}
