Python SDK for OMOPHub - Medical Vocabulary API with semantic search

Maintainers

alex-omophub

Project description

OMOPHub Python SDK

Query millions of standardized medical concepts through a simple Python API

Access SNOMED CT, ICD-10, RxNorm, LOINC, and 90+ OHDSI ATHENA vocabularies without downloading, installing, or maintaining local databases.

Documentation · API Reference · Examples

Why OMOPHub?

Working with OHDSI ATHENA vocabularies traditionally requires downloading multi-gigabyte files, setting up a database instance, and writing complex SQL queries. OMOPHub eliminates this friction.

Traditional Approach	With OMOPHub
Download 5GB+ ATHENA vocabulary files	`pip install omophub`
Set up and maintain database	One API call
Write complex SQL with multiple JOINs	Simple Python methods
Manually update vocabularies quarterly	Always current data
Local infrastructure required	Works anywhere Python runs

Installation

pip install omophub

Quick Start

from omophub import OMOPHub

# Initialize client (uses OMOPHUB_API_KEY env variable, or pass api_key="...")
client = OMOPHub()

# Get a concept by ID
concept = client.concepts.get(201826)
print(concept["concept_name"])  # "Type 2 diabetes mellitus"

# Search for concepts across vocabularies
results = client.search.basic("metformin", vocabulary_ids=["RxNorm"], domain_ids=["Drug"])
for c in results["concepts"]:
    print(f"{c['concept_id']}: {c['concept_name']}")

# Map ICD-10 code to SNOMED
mappings = client.mappings.get_by_code("ICD10CM", "E11.9", target_vocabulary="SNOMED")

# Navigate concept hierarchy
ancestors = client.hierarchy.ancestors(201826, max_levels=3)

Semantic Search

Find concepts with natural-language queries, matched using neural embeddings:

# Natural language search - understands clinical intent
results = client.search.semantic("high blood sugar levels")
for r in results["results"]:
    print(f"{r['concept_name']} (similarity: {r['similarity_score']:.2f})")

# Filter by vocabulary and set minimum similarity threshold
results = client.search.semantic(
    "heart attack",
    vocabulary_ids=["SNOMED"],
    domain_ids=["Condition"],
    threshold=0.5
)

# Iterate through all results with auto-pagination
for result in client.search.semantic_iter("chronic kidney disease", page_size=50):
    print(f"{result['concept_id']}: {result['concept_name']}")

Similarity Search

Find concepts similar to a known concept or to a natural-language query:

# Find concepts similar to a known concept
results = client.search.similar(concept_id=201826, algorithm="hybrid")
for r in results["results"]:
    print(f"{r['concept_name']} (score: {r['similarity_score']:.2f})")

# Find similar concepts using a natural language query
results = client.search.similar(
    query="medications for high blood pressure",
    algorithm="semantic",
    similarity_threshold=0.6,
    vocabulary_ids=["RxNorm"],
    include_scores=True,
)
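If you want to narrow a response further on the client side, a small helper can filter and rank the returned entries. This is a minimal sketch, not part of the SDK: `top_matches` is a hypothetical helper, it assumes the response has the `results` / `similarity_score` shape shown above, and the sample data uses made-up placeholder names and scores.

```python
def top_matches(response, min_score=0.6, limit=5):
    """Return up to `limit` results scoring at least `min_score`, best first."""
    kept = [r for r in response.get("results", []) if r["similarity_score"] >= min_score]
    kept.sort(key=lambda r: r["similarity_score"], reverse=True)
    return kept[:limit]


# Made-up response with placeholder names and scores, shaped like the output above.
sample = {
    "results": [
        {"concept_name": "placeholder A", "similarity_score": 0.55},
        {"concept_name": "placeholder B", "similarity_score": 0.91},
        {"concept_name": "placeholder C", "similarity_score": 0.72},
    ]
}
best = top_matches(sample, min_score=0.6, limit=2)
print([r["concept_name"] for r in best])
```

With a real client you would pass the return value of `client.search.similar(...)` instead of `sample`.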

Async Support

import asyncio
from omophub import AsyncOMOPHub

async def main():
    async with AsyncOMOPHub() as client:
        concept = await client.concepts.get(201826)
        print(concept["concept_name"])

asyncio.run(main())
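When you need several concepts at once, the async client can be combined with `asyncio.gather`. The sketch below shows one way to cap the number of requests in flight. It is an illustration only: `fetch_many` is a hypothetical helper, and `fake_get` stands in for `client.concepts.get` so the example runs offline. For large ID lists, the `batch()` method listed under API Resources may be a better fit.

```python
import asyncio


async def fetch_many(fetch_one, concept_ids, max_concurrency=5):
    """Fetch several concepts concurrently, at most max_concurrency at a time.

    Results are returned in the same order as concept_ids.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(concept_id):
        async with semaphore:
            return await fetch_one(concept_id)

    return await asyncio.gather(*(guarded(cid) for cid in concept_ids))


# Fake coroutine standing in for client.concepts.get so the sketch runs offline.
async def fake_get(concept_id):
    await asyncio.sleep(0)
    return {"concept_id": concept_id}


concepts = asyncio.run(fetch_many(fake_get, [1, 2, 3]))
print(concepts)
```

Inside `async with AsyncOMOPHub() as client:` you would call `await fetch_many(client.concepts.get, ids)`.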

Use Cases

ETL & Data Pipelines

Validate and map clinical codes during OMOP CDM transformations:

# Validate that a source code exists and find its standard equivalent
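def validate_and_map(source_vocab, source_code):
    concept = client.concepts.get_by_code(source_vocab, source_code)
    if concept["standard_concept"] == "S":
        return concept["concept_id"]  # already a standard concept
    # Non-standard: look up its mapping to a standard SNOMED concept
    mappings = client.mappings.get(concept["concept_id"],
                                    target_vocabulary="SNOMED")
    if not mappings["mappings"]:
        return None  # no SNOMED mapping found
    return mappings["mappings"][0]["target_concept_id"]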
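ETL source tables usually repeat the same codes many times, so caching lookups can save a lot of API calls. The sketch below wraps any two-argument mapping function in `functools.lru_cache`. It is not part of the SDK: `make_cached_mapper` is a hypothetical helper, and `fake_map` stands in for `validate_and_map` so the example runs without an API key.

```python
from functools import lru_cache


def make_cached_mapper(map_one, maxsize=10_000):
    """Wrap map_one(vocab, code) so each distinct pair is resolved only once."""

    @lru_cache(maxsize=maxsize)
    def cached(vocab, code):
        return map_one(vocab, code)

    return cached


# Fake mapper standing in for validate_and_map; records how often it is called.
lookups = []


def fake_map(vocab, code):
    lookups.append((vocab, code))
    return f"{vocab}:{code}"  # placeholder result, not a real concept ID


mapper = make_cached_mapper(fake_map)
rows = [("ICD10CM", "E11.9"), ("ICD10CM", "I10"), ("ICD10CM", "E11.9")]
mapped = [mapper(vocab, code) for vocab, code in rows]
print(mapped, len(lookups))
```

Note that `lru_cache` does not cache exceptions, so a code that fails to resolve is looked up again each time it appears.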

Data Quality Checks

Verify that source codes exist in a vocabulary:

import omophub

# Check that all your condition codes are valid
condition_codes = ["E11.9", "I10", "J44.9"]  # ICD-10-CM codes
for code in condition_codes:
    try:
        concept = client.concepts.get_by_code("ICD10CM", code)
        print(f"OK {code}: {concept['concept_name']}")
    except omophub.NotFoundError:
        print(f"ERROR {code}: Invalid code!")

Phenotype Development

Explore hierarchies to build comprehensive concept sets:

# Get all descendants of "Type 2 diabetes mellitus" for phenotype
descendants = client.hierarchy.descendants(201826, max_levels=5)
concept_set = [d["concept_id"] for d in descendants["concepts"]]
print(f"Found {len(concept_set)} concepts for T2DM phenotype")
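A phenotype concept set normally includes the root concept as well as its descendants. Whether the response contains the root, or repeats a concept reachable through more than one parent, is not stated here, so the sketch below handles both defensively. `build_concept_set` is a hypothetical helper, and the sample response uses placeholder IDs rather than real concepts.

```python
def build_concept_set(root_id, descendants_response):
    """Return a sorted, de-duplicated list of concept IDs including the root."""
    ids = {root_id}
    for concept in descendants_response.get("concepts", []):
        ids.add(concept["concept_id"])
    return sorted(ids)


# Made-up response with placeholder IDs, shaped like the example above.
sample_response = {
    "concepts": [
        {"concept_id": 30, "concept_name": "placeholder child A"},
        {"concept_id": 20, "concept_name": "placeholder child B"},
        {"concept_id": 30, "concept_name": "placeholder child A"},  # duplicate
    ]
}
concept_set = build_concept_set(10, sample_response)
print(concept_set)
```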

Clinical Applications

Build terminology lookups into healthcare applications:

# Autocomplete for clinical coding interface
suggestions = client.concepts.suggest("diab", vocabulary_ids=["SNOMED"], page_size=10)
# Returns: ["Diabetes mellitus", "Diabetic nephropathy", "Diabetic retinopathy", ...]

API Resources

Resource	Description	Key Methods
`concepts`	Concept lookup and batch operations	`get()`, `get_by_code()`, `batch()`, `suggest()`
`search`	Full-text and semantic search	`basic()`, `basic_iter()`, `advanced()`, `semantic()`, `semantic_iter()`, `similar()`, `fuzzy()`
`hierarchy`	Navigate concept relationships	`ancestors()`, `descendants()`
`mappings`	Cross-vocabulary mappings	`get()`, `get_by_code()`, `map()`
`vocabularies`	Vocabulary metadata	`list()`, `get()`, `stats()`
`domains`	Domain information	`list()`, `get()`, `concepts()`

Configuration

client = OMOPHub(
    api_key="oh_xxx",                        # Or set OMOPHUB_API_KEY env var
    base_url="https://api.omophub.com/v1",   # API endpoint
    timeout=30.0,                             # Request timeout (seconds)
    max_retries=3,                            # Retry attempts
    vocab_version="2025.2",                   # Specific vocabulary version
)
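In deployed environments it can be convenient to read all of these settings from the environment. The sketch below builds the keyword arguments shown above from a mapping of environment variables. Only `OMOPHUB_API_KEY` is documented here; `load_client_settings` and the other variable names are assumptions made for this example.

```python
import os


def load_client_settings(env=None):
    """Build keyword arguments for OMOPHub(...) from environment variables.

    Only OMOPHUB_API_KEY is documented above; the other variable names
    are hypothetical and chosen for this example.
    """
    env = os.environ if env is None else env
    settings = {}
    if "OMOPHUB_API_KEY" in env:
        settings["api_key"] = env["OMOPHUB_API_KEY"]
    if "OMOPHUB_BASE_URL" in env:  # hypothetical variable name
        settings["base_url"] = env["OMOPHUB_BASE_URL"]
    if "OMOPHUB_TIMEOUT" in env:  # hypothetical variable name
        settings["timeout"] = float(env["OMOPHUB_TIMEOUT"])
    if "OMOPHUB_MAX_RETRIES" in env:  # hypothetical variable name
        settings["max_retries"] = int(env["OMOPHUB_MAX_RETRIES"])
    if "OMOPHUB_VOCAB_VERSION" in env:  # hypothetical variable name
        settings["vocab_version"] = env["OMOPHUB_VOCAB_VERSION"]
    return settings


example_env = {
    "OMOPHUB_API_KEY": "oh_xxx",
    "OMOPHUB_TIMEOUT": "10",
    "OMOPHUB_MAX_RETRIES": "5",
}
settings = load_client_settings(example_env)
print(settings)
# With the SDK installed: client = OMOPHub(**settings)
```

Leaving unset values out of the dictionary lets the client fall back to its own defaults.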

Error Handling

import omophub

try:
    concept = client.concepts.get(999999999)
except omophub.NotFoundError as e:
    print(f"Concept not found: {e.message}")
except omophub.AuthenticationError as e:
    print(f"Check your API key: {e.message}")
except omophub.RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after} seconds")
except omophub.APIError as e:
    print(f"API error {e.status_code}: {e.message}")
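The `retry_after` value on a rate-limit error can drive a simple wait-and-retry loop. The sketch below is not part of the SDK: `call_with_retry` is a hypothetical helper, and `RateLimitError` is a local stand-in with the same `retry_after` attribute so the example runs offline. In real code you would catch `omophub.RateLimitError` instead. The client's `max_retries` setting may already cover common cases; this pattern is for when you want explicit control over the wait.

```python
import time


class RateLimitError(Exception):
    """Local stand-in for omophub.RateLimitError (assumption: exposes retry_after)."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(); on RateLimitError wait retry_after seconds and try again.

    Re-raises the last RateLimitError once max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise
            sleep(err.retry_after)


# Demonstration with a fake call that is rate limited twice, then succeeds.
calls = {"count": 0}
waits = []


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError(retry_after=2)
    return {"concept_id": 201826}


# waits.append replaces time.sleep so the demo records waits instead of sleeping.
result = call_with_retry(flaky_lookup, sleep=waits.append)
print(result, waits)
```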

Type Safety

The SDK is fully typed with TypedDict definitions for IDE autocomplete:

from omophub import OMOPHub, Concept

client = OMOPHub()
concept: Concept = client.concepts.get(201826)

# IDE autocomplete works for all fields
concept["concept_id"]      # int
concept["concept_name"]    # str
concept["vocabulary_id"]   # str
concept["domain_id"]       # str
concept["concept_class_id"] # str

Integration Examples

With Pandas

import pandas as pd

# Search and load into DataFrame
results = client.search.basic("hypertension", page_size=100)
df = pd.DataFrame(results["concepts"])
print(df[["concept_id", "concept_name", "vocabulary_id"]].head())

In Jupyter Notebooks

# Iterate through all results with auto-pagination
for concept in client.search.basic_iter("diabetes", page_size=100):
    process_concept(concept)  # process_concept: your own per-concept handler
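How `basic_iter` paginates internally is not documented here. As a rough mental model, an auto-paginating iterator can be written as a generator that keeps requesting pages until one comes back short. The sketch below illustrates that general pattern only; `iter_all` and `fake_fetch` are hypothetical names and do not reflect the SDK's implementation.

```python
def iter_all(fetch_page, page_size=100):
    """Yield every item from a paged source, requesting pages until one is short.

    fetch_page(page, page_size) must return a list of items for that 1-based page.
    """
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            break
        page += 1


# Fake paged source with 5 items so the sketch runs offline.
DATA = [f"item-{i}" for i in range(5)]
requested_pages = []


def fake_fetch(page, page_size):
    requested_pages.append(page)
    start = (page - 1) * page_size
    return DATA[start:start + page_size]


collected = list(iter_all(fake_fetch, page_size=2))
print(collected, requested_pages)
```

Because the generator is lazy, stopping the loop early also stops further page requests.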

Compared to Alternatives

Feature	OMOPHub SDK	ATHENA Download	OHDSI WebAPI
Setup time	1 minute	Hours	Hours
Infrastructure	None	Database required	Full OHDSI stack
Updates	Automatic	Manual download	Manual
Programmatic access	Native Python	SQL queries	REST API

Best for: Teams who need quick, programmatic access to OMOP vocabularies without infrastructure overhead.

Documentation

Contributing

We welcome contributions! Please see our Contributing Guide for details.

# Clone and install for development
git clone https://github.com/OMOPHub/omophub-python.git
cd omophub-python
pip install -e ".[dev]"

# Run tests
pytest

Support

License

MIT License - see LICENSE for details.

Built for the OHDSI community

Release history

1.7.0

Apr 14, 2026

1.6.0

Apr 10, 2026

1.5.1

Apr 8, 2026

1.5.0

Mar 26, 2026

1.4.1

Feb 28, 2026

1.4.0 (this version)

Feb 23, 2026

1.3.1

Jan 24, 2026

1.3.0

Jan 6, 2026

1.2.0

Dec 9, 2025

1.0.1

Dec 3, 2025

Download files

Source Distribution

omophub-1.4.0.tar.gz (29.5 kB view details)

Uploaded Feb 23, 2026 Source

Built Distribution

omophub-1.4.0-py3-none-any.whl (33.9 kB view details)

Uploaded Feb 23, 2026 Python 3

File details

Details for the file omophub-1.4.0.tar.gz.

File metadata

Download URL: omophub-1.4.0.tar.gz
Upload date: Feb 23, 2026
Size: 29.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for omophub-1.4.0.tar.gz
Algorithm	Hash digest
SHA256	`250c83b36917009f97e5ad6c9dad9777d6c945a3102381f6bb6c457b94de81d4`
MD5	`6796382de34c733067fe89639653cfb8`
BLAKE2b-256	`f98b5f856f3033b0a37168470dbee39f3f7e6377fd61c508de228d20be4dbdbb`

Provenance

The following attestation bundles were made for omophub-1.4.0.tar.gz:

Publisher: publish.yml on OMOPHub/omophub-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omophub-1.4.0.tar.gz
- Subject digest: 250c83b36917009f97e5ad6c9dad9777d6c945a3102381f6bb6c457b94de81d4
- Sigstore transparency entry: 980396293
- Sigstore integration time: Feb 23, 2026
Source repository:
- Permalink: OMOPHub/omophub-python@2a3d387fb9e786665a41818e5666519e791ce09a
- Branch / Tag: refs/tags/v1.4.0
- Owner: https://github.com/OMOPHub
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2a3d387fb9e786665a41818e5666519e791ce09a
- Trigger Event: release

File details

Details for the file omophub-1.4.0-py3-none-any.whl.

File metadata

Download URL: omophub-1.4.0-py3-none-any.whl
Upload date: Feb 23, 2026
Size: 33.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for omophub-1.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4d414001e3baa024399d7576fc65f990abad282fdc43042f89d20b47b8239927`
MD5	`f4eda68b93b2381d68aaa4116f0f2e9c`
BLAKE2b-256	`2099d598b92684f5a9955d17d334ec67b297b829a2915f8782a795043893fa0e`

Provenance

The following attestation bundles were made for omophub-1.4.0-py3-none-any.whl:

Publisher: publish.yml on OMOPHub/omophub-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omophub-1.4.0-py3-none-any.whl
- Subject digest: 4d414001e3baa024399d7576fc65f990abad282fdc43042f89d20b47b8239927
- Sigstore transparency entry: 980396376
- Sigstore integration time: Feb 23, 2026
Source repository:
- Permalink: OMOPHub/omophub-python@2a3d387fb9e786665a41818e5666519e791ce09a
- Branch / Tag: refs/tags/v1.4.0
- Owner: https://github.com/OMOPHub
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2a3d387fb9e786665a41818e5666519e791ce09a
- Trigger Event: release
