OMOP CDM utils in Python

pyomop

✨ Overview

pyomop is a Python library for working with databases that comply with the OHDSI OMOP Common Data Model (CDM) v5.4 or v6, using SQLAlchemy as the ORM. It converts query results to pandas DataFrames for machine learning pipelines and provides utilities for working with OMOP vocabularies. Table definitions are based on the omop-cdm library. pyomop is designed as a lightweight, easy-to-use library for researchers and developers who experiment with and test against OMOP CDM databases.

  • Supports SQLite, PostgreSQL, and MySQL. All tables are created in the default schema; see Usage for connection details.
  • Supports LLM-based natural language queries via llama-index (see llm_example.py for usage).
  • Executes OHDSI QueryLibrary queries; see Usage for an example.
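
As a rough illustration of what connecting to the three backends involves, the sketch below builds SQLAlchemy-style async connection URLs. The helper `build_async_url` is hypothetical and not part of pyomop, and the driver names (`aiosqlite`, `asyncpg`, `aiomysql`) are assumptions about a typical async SQLAlchemy setup rather than a statement of what pyomop uses internally.

```python
from urllib.parse import quote_plus


def build_async_url(db="sqlite", host="localhost", port=None,
                    user="", pw="", name="cdm.sqlite"):
    """Hypothetical helper: return a SQLAlchemy-style async URL string."""
    if db == "sqlite":
        # SQLite needs only a file path; three slashes mean a relative path.
        return f"sqlite+aiosqlite:///{name}"
    # Assumed driver names and default ports for the server-based backends.
    drivers = {"pgsql": ("postgresql+asyncpg", 5432),
               "mysql": ("mysql+aiomysql", 3306)}
    if db not in drivers:
        raise ValueError(f"unsupported database type: {db}")
    scheme, default_port = drivers[db]
    # Percent-encode credentials so characters such as '@' do not break the URL.
    auth = f"{quote_plus(user)}:{quote_plus(pw)}"
    return f"{scheme}://{auth}@{host}:{port or default_port}/{name}"


print(build_async_url())
print(build_async_url(db="pgsql", host="db.example.org",
                      user="omop", pw="p@ss", name="cdm"))
```

The `db` values mirror the ones shown in the usage section ('pgsql' and 'mysql'); everything else in the sketch is illustrative.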

Installation

Stable release:

pip install pyomop

Development version:

git clone https://github.com/dermatologist/pyomop.git
cd pyomop
pip install -e .

LLM support:

pip install pyomop[llm]

See llm_example.py for usage.

🔧 Usage

from pyomop import CdmEngineFactory, CdmVocabulary, CdmVector
# cdm6 and cdm54 are supported
from pyomop.cdm54 import Person, Cohort, Vocabulary, Base
from sqlalchemy import select
import datetime
import asyncio

async def main():
    cdm = CdmEngineFactory() # Creates SQLite database by default for fast testing
    # cdm = CdmEngineFactory(db='pgsql', host='', port=5432,
    #                       user='', pw='',
    #                       name='', schema='public')
    # cdm = CdmEngineFactory(db='mysql', host='', port=3306,
    #                       user='', pw='',
    #                       name='')
    engine = cdm.engine
    # Comment out the following line if you are using an existing database.
    # Both cdm6 and cdm54 are supported; see the import statements above.
    await cdm.init_models(Base.metadata) # Initializes the database with the OMOP CDM tables
    vocab = CdmVocabulary(cdm, version='cdm54') # or 'cdm6' for v6
    # Uncomment the following line to create a new vocabulary from CSV files
    # vocab.create_vocab('/path/to/csv/files')
    async with cdm.session() as session:
        async with session.begin():
            session.add(Cohort(cohort_definition_id=2, subject_id=100,
                cohort_end_date=datetime.datetime.now(),
                cohort_start_date=datetime.datetime.now()))
            session.add(
                Person(
                    person_id=100,
                    gender_concept_id=8532,
                    gender_source_concept_id=8512,
                    year_of_birth=1980,
                    month_of_birth=1,
                    day_of_birth=1,
                    birth_datetime=datetime.datetime(1980, 1, 1),
                    race_concept_id=8552,
                    race_source_concept_id=8552,
                    ethnicity_concept_id=38003564,
                    ethnicity_source_concept_id=38003564,
                )
            )
        await session.commit()

        stmt = select(Cohort).where(Cohort.subject_id == 100)
        result = await session.execute(stmt)
        for row in result.scalars():
            print(row)

        cohort = await session.get(Cohort, 1)
        print(cohort)

        vec = CdmVector()

        # supports QueryLibrary queries
        # https://github.com/OHDSI/QueryLibrary/blob/master/inst/shinyApps/QueryLibrary/queries/person/PE02.md
        result = await vec.query_library(cdm, resource='person', query_name='PE02')
        df = vec.result_to_df(result)
        print("DataFrame from result:")
        print(df.head())

        result = await vec.execute(cdm, query='SELECT * from cohort;')
        print("Executing custom query:")
        df = vec.result_to_df(result)
        print("DataFrame from result:")
        print(df.head())

        # access sqlalchemy result directly
        for row in result:
            print(row)


    # The session was already closed by the `async with` block above,
    # so only the engine needs to be disposed.
    await engine.dispose()

asyncio.run(main())
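
The step from query results to a DataFrame can be pictured with the standard library alone. The sketch below is not pyomop code: it uses `sqlite3` with a deliberately simplified `cohort` table (an assumption, not the full OMOP definition) and a hypothetical helper, `rows_to_records`, that turns rows into a list of dictionaries, which is a shape `pandas.DataFrame.from_records` accepts.

```python
import sqlite3


def rows_to_records(cursor):
    """Hypothetical helper: pair each row with the cursor's column names."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# In-memory database with a simplified cohort table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cohort ("
    "cohort_definition_id INTEGER, subject_id INTEGER, "
    "cohort_start_date TEXT, cohort_end_date TEXT)"
)
conn.execute("INSERT INTO cohort VALUES (2, 100, '2020-01-01', '2020-12-31')")

# Each record maps column name -> value for one row.
records = rows_to_records(conn.execute("SELECT * FROM cohort"))
print(records)
conn.close()
```

With pandas installed, `pandas.DataFrame.from_records(records)` would turn this list into a one-row DataFrame.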

Command-line

pyomop --help

Additional Tools

  • Convert FHIR to pandas DataFrame: fhiry
  • OMOP CDM libraries for other languages: .NET, Golang

Supported Databases

  • PostgreSQL
  • MySQL
  • SQLite

Contributing

Pull requests are welcome! See CONTRIBUTING.md.

⭐️ Give the project a star if you find it useful!
