OpenMed delivers state-of-the-art biomedical and clinical LLMs that rival proprietary enterprise stacks, unifying model discovery, advanced extractions, and one-line orchestration.

These details have not been verified by PyPI

Project description

OpenMed

Production-ready medical NLP toolkit powered by state-of-the-art transformers

Transform clinical text into structured insights with a single line of code. OpenMed delivers enterprise-grade entity extraction, assertion detection, and medical reasoning—no vendor lock-in, no compromise on accuracy.

from openmed import analyze_text

result = analyze_text(
    "Patient started on imatinib for chronic myeloid leukemia.",
    model_name="disease_detection_superclinical"
)

for entity in result.entities:
    print(f"{entity.label:<12} {entity.text:<35} {entity.confidence:.2f}")
# DISEASE      chronic myeloid leukemia            0.98
# DRUG         imatinib                            0.95

✨ Why OpenMed?

Specialized Models: 12+ curated medical NER models outperforming proprietary solutions
HIPAA-Compliant PII Detection: Smart de-identification with all 18 Safe Harbor identifiers
One-Line Deployment: From prototype to production in minutes
Dockerized REST API: FastAPI endpoints for service deployments
Batch Processing: Multi-file workflows with progress tracking
Production-Ready: Configuration profiles, profiling tools, and medical-aware tokenization
Zero Lock-In: Apache 2.0 licensed, runs on your infrastructure

Quick Start

Installation

# From a local checkout of this repo:
# Install with Hugging Face support
uv pip install -e ".[hf]"

# Or include REST service dependencies
uv pip install -e ".[hf,service]"

Apple Silicon acceleration in Python:

uv pip install -e ".[mlx]"

Swift apps on macOS and iOS use OpenMedKit. As of 1.4.0, that means:

MLX on Apple Silicon macOS and real iPhone/iPad hardware for supported OpenMed PII, OpenAI Privacy Filter, OpenAI Nemotron Privacy Filter, OpenMed Multilingual Privacy Filter, and experimental GLiNER-family artifacts
CoreML when you already have a bundled Apple model package or want the fallback Apple path

Add the Swift package like this:

dependencies: [
    .package(url: "https://github.com/maziyarpanahi/openmed.git", from: "1.4.0"),
]

OpenMedKit is public and now supports native MLX runtime paths for PII token classification, Privacy Filter, and experimental GLiNER-family zero-shot tasks. The broader OpenMed model-packaging flow is still being hardened across the full collection, so treat conversion as active work rather than a fully universal public release surface.

For published releases, the editable install examples above can be replaced with plain uv pip install "openmed[...]".

Three Ways to Use OpenMed

1️⃣ Python API — One-liner for scripts and notebooks

from openmed import analyze_text

result = analyze_text(
    "Patient received 75mg clopidogrel for NSTEMI.",
    model_name="pharma_detection_superclinical"
)

2️⃣ REST API Service — FastAPI endpoints for app backends

uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080

3️⃣ Batch Processing — Programmatic multi-document workflows

from openmed import BatchProcessor

processor = BatchProcessor(
    model_name="disease_detection_superclinical",
    confidence_threshold=0.55,
    group_entities=True,
)

result = processor.process_texts([
    "Patient started metformin for type 2 diabetes.",
    "Imatinib started for chronic myeloid leukemia.",
])

Key Features

Core Capabilities

Curated Model Registry: Metadata-rich catalog with 12+ specialized medical NER models
PII Detection & De-identification: HIPAA-compliant de-identification with smart entity merging
Medical-Aware Tokenization: Clean handling of clinical patterns (COVID-19, CAR-T, IL-6)
Advanced NER Processing: Confidence filtering, entity grouping, and span alignment
Multiple Output Formats: Dict, JSON, HTML, CSV for any downstream system

Production Tools (v1.4.0)

Batch Processing: Multi-text and multi-file workflows with progress tracking
Configuration Profiles: dev/prod/test/fast presets with flexible overrides
Performance Profiling: Built-in inference timing and bottleneck analysis
Dockerized REST API: GET /health, POST /analyze, POST /pii/extract, POST /pii/deidentify
Service Reliability Hardening: request validation, shared pipeline preload, and timeout/error envelopes

Documentation

Comprehensive guides available at openmed.life/docs

Quick links:

Getting Started — Installation and first analysis
Analyze Text Helper — Python API reference
MLX Backend — Apple Silicon Python runtime
OpenMedKit (Swift) — Use OpenMed models in macOS and iOS apps
PII Detection Guide — Complete de-identification tutorial (v0.5.0)
Batch Processing — Multi-text and multi-file workflows
Configuration Profiles — Environment-specific presets
REST Service — FastAPI and Docker usage
Model Registry — Browse available models
Configuration — Settings and environment variables

REST API

OpenMed includes a Docker-friendly FastAPI service with reliability hardening:

GET /health
POST /analyze
POST /pii/extract
POST /pii/deidentify

Run locally

uv pip install -e ".[hf,service]"
uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080

Optional shared model warm-up:

OPENMED_SERVICE_PRELOAD_MODELS=disease_detection_superclinical,OpenMed/OpenMed-PII-SuperClinical-Small-44M-v1 \
uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080

Run with Docker

docker build -t openmed:1.4.0 .
docker run --rm -p 8080:8080 -e OPENMED_PROFILE=prod openmed:1.4.0

Example request

curl -X POST http://127.0.0.1:8080/pii/extract \
  -H "Content-Type: application/json" \
  -d '{"text":"Paciente: Maria Garcia, DNI: 12345678Z","lang":"es"}'

See the full service guide at REST Service docs.

Non-2xx responses now use a unified envelope:

{
  "error": {
    "code": "validation_error",
    "message": "Request validation failed",
    "details": [
      {
        "field": "body.text",
        "message": "Text must not be blank",
        "type": "value_error"
      }
    ]
  }
}

Models

OpenMed includes a curated registry of 12+ specialized medical NER models:

Model	Specialization	Entity Types	Size
`disease_detection_superclinical`	Disease & Conditions	DISEASE, CONDITION, DIAGNOSIS	434M
`pharma_detection_superclinical`	Drugs & Medications	DRUG, MEDICATION, TREATMENT	434M
`pii_detection_superclinical`	PII & De-identification	NAME, DATE, SSN, PHONE, EMAIL, ADDRESS	434M
`anatomy_detection_electramed`	Anatomy & Body Parts	ANATOMY, ORGAN, BODY_PART	109M
`gene_detection_genecorpus`	Genes & Proteins	GENE, PROTEIN	109M

📖 Full Model Catalog

Advanced Usage

PII Detection & De-identification (v0.5.0)

from openmed import extract_pii, deidentify

# Extract PII entities with smart merging (default)
result = extract_pii(
    "Patient: John Doe, DOB: 01/15/1970, SSN: 123-45-6789",
    model_name="pii_detection_superclinical",
    use_smart_merging=True  # Prevents entity fragmentation
)

# De-identify with multiple methods
masked = deidentify(text, method="mask")        # [NAME], [DATE]
removed = deidentify(text, method="remove")     # Complete removal
replaced = deidentify(text, method="replace")   # Faker-backed locale-aware fakes
hashed = deidentify(text, method="hash")        # Cryptographic hashing
shifted = deidentify(text, method="shift_dates", date_shift_days=180)

# Deterministic obfuscation: same input -> same surrogate within doc
deidentify(text, method="replace", lang="pt", locale="pt_BR",
           consistent=True, seed=42)

Smart Entity Merging (NEW in v0.5.0): Fixes tokenization fragmentation by merging split entities like dates (01/15/1970 instead of 01 + /15/1970), ensuring production-ready de-identification.

Faker-backed obfuscation (v1.3.0): method="replace" uses Faker with custom providers for clinical IDs (CPF, CNPJ, BSN, NIR, Codice Fiscale, NIE, Aadhaar, Steuer-ID, NPI). Surrogates are locale-aware, format-preserving, and optionally deterministic. See Anonymization Guide.

HIPAA Compliance: Covers all 18 Safe Harbor identifiers with configurable confidence thresholds.

📓 Complete PII Notebook | 📖 Smart Merging | 📖 Anonymization

Privacy Filter Family (Public)

OpenMed ships three Privacy Filter families on the OpenAI Privacy Filter architecture — same model code (gpt-oss-style sparse-MoE transformer with local attention, sink tokens, RoPE+YaRN, tiktoken o200k_base tokenization), different training data:

Variant	Trained on	PyTorch (CPU + CUDA)	MLX full (OpenMedKit + Apple Silicon)	MLX 8-bit (OpenMedKit + Apple Silicon)
OpenAI Privacy Filter	OpenAI's PII training set	`openai/privacy-filter`	`OpenMed/privacy-filter-mlx`	`OpenMed/privacy-filter-mlx-8bit`
Nemotron-PII fine-tune	Nemotron PII dataset	`OpenMed/privacy-filter-nemotron`	`OpenMed/privacy-filter-nemotron-mlx`	`OpenMed/privacy-filter-nemotron-mlx-8bit`
OpenMed Multilingual Privacy Filter	OpenMed multilingual PII corpus with official support for 16 languages	`OpenMed/privacy-filter-multilingual`	`OpenMed/privacy-filter-multilingual-mlx`	`OpenMed/privacy-filter-multilingual-mlx-8bit`

All model IDs above route through the same extract_pii() / deidentify() API — only the model_name= argument changes.

The MLX artifacts above use the OpenMed MLX artifact layout consumed by OpenMedKit for native macOS, iOS, and iPadOS apps.

Install

The PyTorch path runs anywhere (Linux, macOS, Windows; CPU or CUDA):

pip install "openmed[hf]"

The MLX path adds Apple Silicon acceleration on top of [hf]:

pip install "openmed[mlx]"          # MLX runtime + tiktoken + huggingface-hub

tiktoken (the OpenAI Privacy Filter tokenizer) ships in the [mlx] extra.

Use it (PyTorch — Linux / Windows / non-Apple-Silicon macOS)

from openmed import extract_pii, deidentify

text = ("Patient Sarah Connor (DOB: 03/15/1985) at MRN 4471882. "
        "Email: sarah.connor@example.com, phone (415) 555-7012.")

# OpenAI baseline — runs on CPU/CUDA via Transformers
result = extract_pii(text, model_name="openai/privacy-filter")

# Nemotron-PII fine-tune — same code path, different weights
result = extract_pii(text, model_name="OpenMed/privacy-filter-nemotron")

# De-identify with any method
deidentify(text, model_name="OpenMed/privacy-filter-nemotron", method="mask")
deidentify(text, model_name="OpenMed/privacy-filter-nemotron",
           method="replace", consistent=True, seed=42)

Use it (MLX — Apple Silicon)

from openmed import extract_pii, deidentify

text = "Patient Sarah Connor born on 03/15/1985, MRN 4471882."

# OpenAI baseline (full / 8-bit MLX artifacts)
extract_pii(text, model_name="OpenMed/privacy-filter-mlx")
extract_pii(text, model_name="OpenMed/privacy-filter-mlx-8bit")

# Nemotron-PII fine-tune (full / 8-bit MLX artifacts)
extract_pii(text, model_name="OpenMed/privacy-filter-nemotron-mlx")
extract_pii(text, model_name="OpenMed/privacy-filter-nemotron-mlx-8bit")

# OpenMed Multilingual Privacy Filter (full / 8-bit MLX artifacts)
extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx")
extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx-8bit")

Cross-platform note

The MLX artifact names work everywhere — on a non-Apple-Silicon host (or anywhere MLX isn't installed) the request is automatically substituted with the matching PyTorch model and a one-time UserWarning names the substitution. The substitution is family-aware:

OpenMed/privacy-filter-mlx* ⇒ falls back to openai/privacy-filter
OpenMed/privacy-filter-nemotron-mlx* ⇒ falls back to OpenMed/privacy-filter-nemotron
OpenMed/privacy-filter-multilingual-mlx* ⇒ falls back to OpenMed/privacy-filter-multilingual

So your code can ship an MLX model name and run on any host without changes — Apple Silicon users get MLX speed, everyone else gets the same family's PyTorch checkpoint.

📖 Privacy Filter Architecture & Backend Routing | ▶ Side-by-side example | ▶ Faker obfuscation demo

Multilingual PII (9 Languages)

OpenMed now supports multilingual PII extraction and de-identification across en, fr, de, it, es, nl, hi, te, and pt. French, German, Italian, and Spanish expose the full 35-model family; Portuguese ships 31 public API-visible models; Dutch, Hindi, and Telugu currently ship one flagship public model each, bringing the total PII catalog to 210 models.

uv pip install "openmed[hf]" && python -c "from openmed import extract_pii; print([(e.label,e.text) for e in extract_pii('Dr. Pedro Almeida, CPF: 123.456.789-09, email: pedro@hospital.pt, tel: +351 912 345 678', lang='pt').entities])"

from openmed import extract_pii

portuguese = extract_pii(
    "Paciente: Pedro Almeida, CPF: 123.456.789-09, email: pedro@hospital.pt, telefone: +351 912 345 678",
    lang="pt",
    model_name="OpenMed/OpenMed-PII-Portuguese-SnowflakeMed-Large-568M-v1",
    use_smart_merging=True,
)

dutch = extract_pii(
    "Patiënt: Eva de Vries, geboortedatum: 15 januari 1984, BSN: 123456782, telefoon: +31 6 12345678",
    lang="nl",
    model_name="OpenMed/OpenMed-PII-Dutch-SuperClinical-Large-434M-v1",
    use_smart_merging=True,
)

hindi = extract_pii(
    "रोगी: अनीता शर्मा, जन्मतिथि: 15 जनवरी 1984, फोन: +91 9876543210, पता: 12 गली संख्या 5, नई दिल्ली 110001",
    lang="hi",
    model_name="OpenMed/OpenMed-PII-Hindi-SuperClinical-Large-434M-v1",
    use_smart_merging=True,
)

telugu = extract_pii(
    "రోగి: సితా రెడ్డి, జన్మ తేదీ: 15 జనవరి 1984, ఫోన్: +91 9876543210, చిరునామా: 12 వీధి 5, హైదరాబాద్ 500001",
    lang="te",
    model_name="OpenMed/OpenMed-PII-Telugu-SuperClinical-Large-434M-v1",
    use_smart_merging=True,
)

print([(e.label, e.text) for e in portuguese.entities])
print([(e.label, e.text) for e in dutch.entities])
print([(e.label, e.text) for e in hindi.entities])
print([(e.label, e.text) for e in telugu.entities])

Batch Processing

from openmed import BatchProcessor, OpenMedConfig

config = OpenMedConfig.from_profile("prod")
processor = BatchProcessor(
    model_name="disease_detection_superclinical",
    config=config,
    group_entities=True,
)

result = processor.process_texts([
    "Metastatic breast cancer treated with trastuzumab.",
    "Acute lymphoblastic leukemia diagnosed.",
])

Configuration Profiles

from openmed import analyze_text

# Apply a profile programmatically
result = analyze_text(
    text,
    model_name="disease_detection_superclinical",
    config_profile="prod"  # High confidence, grouped entities
)

Performance Profiling

from openmed import analyze_text, profile_inference

with profile_inference() as profiler:
    result = analyze_text(text, model_name="disease_detection_superclinical")

print(profiler.summary())  # Inference time, bottlenecks, recommendations

📖 More Examples

Contributing

We welcome contributions! Whether it's bug reports, feature requests, or pull requests.

🐛 Found a bug? Open an issue

Credits & Acknowledgements

OpenMed builds on excellent open-source work from the community. Particular thanks to:

OpenAI for open-sourcing the Privacy Filter model. The OpenAI Privacy Filter architecture (gpt-oss-style sparse-MoE transformer with local attention, sink tokens, RoPE+YaRN, tiktoken o200k_base tokenization) is the foundation of OpenMed's privacy-filter family — both the upstream baseline and our Nemotron-PII fine-tunes share this architecture.
NVIDIA for releasing the Nemotron PII dataset, which we used to fine-tune the OpenAI Privacy Filter weights into the OpenMed/privacy-filter-nemotron family (MLX, MLX 8-bit).
HuggingFace for transformers, tokenizers, huggingface_hub, and the broader model-distribution ecosystem.
Apple for MLX, which powers OpenMed's Apple Silicon acceleration.
The Faker maintainers for Faker and its community-contributed locale providers, which power OpenMed's locale-aware obfuscation engine.

License

OpenMed is released under the Apache-2.0 License.

Citation

If you use OpenMed in your research, please cite:

@misc{panahi2025openmedneropensourcedomainadapted,
      title={OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets},
      author={Maziyar Panahi},
      year={2025},
      eprint={2508.01630},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.01630},
}

Star History

If you find OpenMed useful, consider giving it a star ⭐ to help others discover it!

Built with ❤️ by the OpenMed team

🌐 Website • 📚 Documentation • 🐦 X/Twitter • 💬 LinkedIn

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

1.4.0

May 11, 2026

1.3.0

Apr 29, 2026

1.2.0

Apr 24, 2026

1.1.0

Apr 21, 2026

1.0.0

Apr 14, 2026

0.6.4

Mar 27, 2026

0.6.3

Mar 19, 2026

0.6.2

Mar 10, 2026

0.6.1

Mar 7, 2026

0.6.0

Feb 28, 2026

0.5.8

Feb 19, 2026

0.5.7

Feb 18, 2026

0.5.6

Feb 18, 2026

0.5.5

Feb 11, 2026

0.5.1

Jan 14, 2026

0.5.0

Jan 14, 2026

0.4.0

Dec 29, 2025

0.3.0

Dec 26, 2025

0.2.2

Dec 20, 2025

0.2.1

Dec 10, 2025

0.2.0

Oct 29, 2025

0.2.0rc2 pre-release

Nov 6, 2025

0.2.0rc1 pre-release

Oct 29, 2025

0.1.10

Oct 17, 2025

0.1.10rc1 pre-release

Oct 17, 2025

0.1.9

Oct 17, 2025

0.1.9rc1 pre-release

Oct 17, 2025

0.1.8

Oct 17, 2025

0.1.8rc2 pre-release

Oct 17, 2025

0.1.8rc0 pre-release

Oct 17, 2025

0.1.7

Oct 12, 2025

0.1.5

Oct 6, 2025

0.1.4

Oct 6, 2025

0.1.1

Sep 10, 2025

0.1

Aug 9, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openmed-1.4.0.tar.gz (163.0 kB view details)

Uploaded May 11, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

openmed-1.4.0-py3-none-any.whl (206.2 kB view details)

Uploaded May 11, 2026 Python 3

File details

Details for the file openmed-1.4.0.tar.gz.

File metadata

Download URL: openmed-1.4.0.tar.gz
Upload date: May 11, 2026
Size: 163.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: Hatch/1.16.5 cpython/3.11.15 HTTPX/0.28.1

File hashes

Hashes for openmed-1.4.0.tar.gz
Algorithm	Hash digest
SHA256	`d15e22ef93e4df10d3e5c4409a3334291911e0e2098dc67bbe3ea8d733d2c6f6`
MD5	`0f593eb7816ef4568795693bb11936c7`
BLAKE2b-256	`1dfe3c4a5d51e31f71f935d9524d8d2095fb22fbf4db843cf297041366c9e443`

See more details on using hashes here.

File details

Details for the file openmed-1.4.0-py3-none-any.whl.

File metadata

Download URL: openmed-1.4.0-py3-none-any.whl
Upload date: May 11, 2026
Size: 206.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: Hatch/1.16.5 cpython/3.11.15 HTTPX/0.28.1

File hashes

Hashes for openmed-1.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6f82898796ca76af767c34171f2680242e9eee0745cc49bf3933cb9d77ef78a9`
MD5	`b8c011db77dc3bcacdb94af3db0cb075`
BLAKE2b-256	`989bd1f28453deb0e0b99db45b5c8c62fa9ada62f6cf767abe19638ae13e2b1e`

See more details on using hashes here.

openmed 1.4.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

OpenMed

✨ Why OpenMed?

Quick Start

Installation

Three Ways to Use OpenMed

Key Features

Core Capabilities

Production Tools (v1.4.0)

Documentation

REST API

Run locally

Run with Docker

Example request

Models

Advanced Usage

PII Detection & De-identification (v0.5.0)

Privacy Filter Family (Public)

Install

Use it (PyTorch — Linux / Windows / non-Apple-Silicon macOS)

Use it (MLX — Apple Silicon)

Cross-platform note

Multilingual PII (9 Languages)

Batch Processing

Configuration Profiles

Performance Profiling

Contributing

Credits & Acknowledgements

License

Citation

Star History

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes