Placeholder package to reserve the name. Real library coming soon.
OpenMed
OpenMed is a Python toolkit for working with the OpenMed collection of healthcare-focused named-entity recognition (NER) models on Hugging Face. It bundles configuration, model loading, advanced post-processing, and formatting utilities that mirror the behaviour of the OpenMed Gradio demos, making it easier to integrate clinical NER into scripts, services, or notebooks.
Status: The package is pre-release and the API may change. Feedback and contributions are welcome while the project stabilises.
Features
- Curated model registry with metadata for the OpenMed Hugging Face collection, including category filters, entity coverage, and confidence guidance.
- One-line model loading via ModelLoader, with optional pipeline creation, caching, and authenticated access to private models.
- Advanced NER post-processing (AdvancedNERProcessor) that applies the filtering and grouping techniques proven in the OpenMed demos.
- Text preprocessing & tokenisation helpers tailored for medical text workflows.
- Output formatting utilities that convert raw predictions into dict/JSON/HTML/CSV for downstream systems.
- Logging and validation helpers to keep pipelines observable and inputs safe.
Installation
Requirements
- Python 3.8 or newer (the package metadata allows 3.6+, but Hugging Face tooling typically requires >=3.8).
- transformers and a compatible deep learning backend such as PyTorch.
- An optional HF_TOKEN environment variable if you need to access gated models.
Install from PyPI
pip install openmed transformers
# Install a backend (PyTorch shown here; follow the instructions for your platform):
pip install torch --index-url https://download.pytorch.org/whl/cpu
If you plan to run on GPU, install the CUDA-enabled PyTorch wheels from the official instructions.
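Before loading a model, it can help to confirm that the interpreter and dependencies match the requirements above. The following is a minimal, standard-library-only sketch; the helper name check_environment is hypothetical and not part of OpenMed.

```python
import importlib.util
import sys


def check_environment(packages=("transformers", "torch"), min_python=(3, 8)):
    """Report whether the interpreter and packages look usable.

    Hypothetical helper for illustration; not part of the openmed package.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key:<14} {'ok' if ok else 'MISSING'}")
```

A missing backend shows up here as MISSING rather than as an import error deep inside a pipeline call.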
Quick start
from openmed.core import ModelLoader
from openmed.processing import format_predictions
loader = ModelLoader() # uses the default configuration
ner = loader.create_pipeline(
    "disease_detection_superclinical",  # registry key or full model ID
    aggregation_strategy="simple",  # group sub-token predictions for quick wins
)
text = "Patient diagnosed with acute lymphoblastic leukemia and started on imatinib."
raw_predictions = ner(text)
result = format_predictions(raw_predictions, text, model_name="Disease Detection")
for entity in result.entities:
    print(f"{entity.label:<12} -> {entity.text} (confidence={entity.confidence:.2f})")
Discovering models
from openmed.core import ModelLoader
from openmed.core.model_registry import list_model_categories, get_models_by_category
loader = ModelLoader()
print(loader.list_available_models()[:5]) # Hugging Face + registry entries
suggestions = loader.get_model_suggestions(
    "Metastatic breast cancer treated with paclitaxel and trastuzumab"
)
for key, info, reason in suggestions:
    print(f"{info.display_name} -> {reason}")
print(list_model_categories())
for info in get_models_by_category("Oncology"):
    print(f"- {info.display_name} ({info.model_id})")
Advanced NER processing
from openmed.core import ModelLoader
from openmed.processing.advanced_ner import create_advanced_processor
loader = ModelLoader()
# aggregation_strategy=None yields raw token-level predictions for maximum control
ner = loader.create_pipeline("pharma_detection_superclinical", aggregation_strategy=None)
text = "Administered 75mg clopidogrel daily alongside aspirin for secondary stroke prevention."
raw = ner(text)
processor = create_advanced_processor(confidence_threshold=0.65)
entities = processor.process_pipeline_output(text, raw)
summary = processor.create_entity_summary(entities)
for entity in entities:
    print(f"{entity.label}: {entity.text} (score={entity.score:.3f})")
print(summary["by_type"])
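To make the idea of grouping and confidence filtering concrete, here is a rough, dependency-free sketch. It assumes token-level predictions shaped like the dicts the Hugging Face token-classification pipeline returns without aggregation (entity, score, word, start, end). The function group_tokens and the sample predictions are hypothetical and hand-written; this is not the AdvancedNERProcessor algorithm.

```python
from collections import Counter

text = "Administered clopidogrel and aspirin."

# Hand-written sample predictions (NOT real model output), BIO-tagged sub-tokens.
tokens = [
    {"entity": "B-CHEM", "score": 0.98, "word": "clopid", "start": 13, "end": 19},
    {"entity": "I-CHEM", "score": 0.96, "word": "##ogrel", "start": 19, "end": 24},
    {"entity": "B-CHEM", "score": 0.60, "word": "aspirin", "start": 29, "end": 36},
]


def group_tokens(text, tokens, confidence_threshold=0.65):
    """Merge adjacent B-/I- tokens into spans, then drop low-confidence spans."""
    spans = []
    for tok in tokens:
        prefix, sep, label = tok["entity"].partition("-")
        if not sep:  # tag without a B-/I- prefix: treat it as a new span
            prefix, label = "B", tok["entity"]
        last = spans[-1] if spans else None
        # Extend the previous span for I-<same label> tokens that start where
        # the previous token ended (or one character later, e.g. after a space).
        if (last is not None and prefix == "I" and last["label"] == label
                and tok["start"] - last["end"] <= 1):
            last["end"] = tok["end"]
            last["scores"].append(tok["score"])
        else:
            spans.append({"label": label, "start": tok["start"],
                          "end": tok["end"], "scores": [tok["score"]]})
    entities = []
    for span in spans:
        score = sum(span["scores"]) / len(span["scores"])  # mean token score
        if score >= confidence_threshold:
            entities.append({"label": span["label"],
                             "text": text[span["start"]:span["end"]],
                             "score": score})
    return entities


entities = group_tokens(text, tokens)
by_type = Counter(entity["label"] for entity in entities)
print(entities)
print(dict(by_type))
```

Averaging token scores is only one possible choice; taking the minimum or the first-token score would make the threshold stricter or looser.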
Text preprocessing & tokenisation
from openmed.processing import TextProcessor, TokenizationHelper
from openmed.core import ModelLoader
text_processor = TextProcessor(normalize_whitespace=True, lowercase=False)
clean_text = text_processor.clean_text("BP 120/80, HR 88 bpm. Start Metformin 500mg bid.")
print(clean_text)
loader = ModelLoader()
model_data = loader.load_model("anatomy_detection_electramed")
token_helper = TokenizationHelper(model_data["tokenizer"])
encoding = token_helper.tokenize_with_alignment(clean_text)
print(encoding["tokens"][:10])
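Alignment here means keeping each token tied to its character span in the original text. As a simplified illustration, the sketch below uses a regular expression instead of a model tokenizer; tokens_with_offsets is a hypothetical helper, and real subword tokenizers will split text differently.

```python
import re

# Words/numbers as one token, every other non-space character on its own.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokens_with_offsets(text):
    """Return (token, start, end) triples such that text[start:end] == token."""
    return [(m.group(), m.start(), m.end()) for m in _TOKEN_RE.finditer(text)]


sample = "BP 120/80"
for token, start, end in tokens_with_offsets(sample):
    print(f"{token!r:>6} [{start}:{end}]")
```

Keeping offsets alongside tokens is what lets predicted labels be mapped back onto the exact substring of the clinical note.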
Formatting outputs
# Reuse `raw_predictions` and `text` from the quick start example
from openmed.processing import format_predictions
formatted = format_predictions(
    raw_predictions,
    text,
    model_name="Disease Detection",
    output_format="json",
    include_confidence=True,
    confidence_threshold=0.5,
)
print(formatted) # JSON string ready for logging or storage
format_predictions can also return CSV rows or rich HTML snippets for dashboards.
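For a sense of what such conversions involve, here is a small standard-library sketch that filters entities by confidence and serialises them as JSON and CSV. The field names (label, text, confidence), the helper functions, and the sample entities are assumptions made for illustration; the exact structure that format_predictions emits may differ.

```python
import csv
import io
import json

# Hand-written sample entities (not real model output).
sample_entities = [
    {"label": "DISEASE", "text": "acute lymphoblastic leukemia", "confidence": 0.97},
    {"label": "DISEASE", "text": "started", "confidence": 0.31},
]


def filter_entities(entities, confidence_threshold=0.5):
    """Keep only entities at or above the confidence threshold."""
    return [e for e in entities if e["confidence"] >= confidence_threshold]


def to_json(entities, model_name):
    """Serialise entities and the model name as a JSON string."""
    return json.dumps({"model_name": model_name, "entities": entities}, indent=2)


def to_csv(entities, fields=("label", "text", "confidence")):
    """Serialise entities as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields))
    writer.writeheader()
    writer.writerows(entities)
    return buffer.getvalue()


kept = filter_entities(sample_entities)
print(to_json(kept, "Disease Detection"))
print(to_csv(kept))
```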
Configuration & logging
from openmed.core import OpenMedConfig, ModelLoader
from openmed.utils import setup_logging
config = OpenMedConfig(
    default_org="OpenMed",
    cache_dir="/tmp/openmed-cache",
    device="cuda",  # "cpu", "cuda", or a specific device index
)
setup_logging(level="INFO")
loader = ModelLoader(config=config)
OpenMedConfig automatically picks up HF_TOKEN from the environment so you can access
private or gated models without storing credentials in code.
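The general pattern of reading a token from the environment, rather than hard-coding it, can be sketched as follows. resolve_hf_token is a hypothetical helper shown only to illustrate the idea; how OpenMedConfig reads HF_TOKEN internally is not documented here.

```python
import os


def resolve_hf_token(env=None):
    """Return the HF_TOKEN value, or None when it is unset or blank.

    Hypothetical helper; not part of the openmed package.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


if __name__ == "__main__":
    # Avoid printing the token itself; only report whether one was found.
    print("token found" if resolve_hf_token() else "no token set")
```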
Validation utilities
from openmed.utils.validation import validate_input, validate_model_name
user_supplied_text = "Patient reports chest pain."  # stand-in for untrusted input
text = validate_input(user_supplied_text, max_length=2000)
model = validate_model_name("OpenMed/OpenMed-NER-DiseaseDetect-SuperClinical-434M")
Use these helpers to guard API endpoints or batch pipelines against malformed inputs.
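The kind of checks such a guard typically performs can be sketched without the library. validate_text below is a hypothetical stand-in written for illustration; it does not reproduce the actual behaviour of validate_input.

```python
def validate_text(value, max_length=2000):
    """Return stripped text, or raise if the input is unusable.

    Hypothetical helper; the checks are assumptions, not OpenMed's rules.
    """
    if not isinstance(value, str):
        raise TypeError(f"expected str, got {type(value).__name__}")
    cleaned = value.strip()
    if not cleaned:
        raise ValueError("input text is empty")
    if len(cleaned) > max_length:
        raise ValueError(f"input text exceeds {max_length} characters")
    return cleaned


if __name__ == "__main__":
    print(validate_text("  Patient reports chest pain.  "))
```

Rejecting oversized or empty input at the boundary keeps later pipeline stages from failing in less obvious ways.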
License
OpenMed is released under the Apache-2.0 License.
File details
Details for the file openmed-0.1.4.tar.gz.
File metadata
- Download URL: openmed-0.1.4.tar.gz
- Upload date:
- Size: 28.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a481c6a8d107e0eb28f10f7d2f4f3fc00af9c15573b4e71c2a5075113a4a4a1e |
| MD5 | 1e94458add311e8a6bc036125e056505 |
| BLAKE2b-256 | 9425b581acd2667e5335bc38096b805f3ca19ba92a232f76642dfc6d41220629 |
File details
Details for the file openmed-0.1.4-py3-none-any.whl.
File metadata
- Download URL: openmed-0.1.4-py3-none-any.whl
- Upload date:
- Size: 36.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3fb18e39118132579afaddf351a4b81b74e77ce9dbc151e73ac38213ed062b4e |
| MD5 | 8a4f81aa10d8d13cdd5dbe0d292ecb99 |
| BLAKE2b-256 | 00d61b2c106355306918c6fbf59667dcc80c3d49af047f2dacea26c3113c78f8 |