A package to read in and convert OMOP data to the right schema.

OMOP Schema

omop_schema is a Python package designed to read, manage, and convert OMOP (Observational Medical Outcomes Partnership) data into the correct schema. It provides tools to handle OMOP CDM (Common Data Model) tables, convert schemas between different formats (e.g., PyArrow, Polars, Pandas), and load datasets efficiently.

Features

  • Schema Management: Define and manage schemas for OMOP CDM tables across CDM versions (e.g., v5.3, v5.4).
  • Schema Conversion: Convert PyArrow schemas to Polars or Pandas-compatible schemas.
  • Dataset Loading: Load datasets from CSV files into PyArrow tables, ensuring they match the defined schema.
  • Optional Dependencies: Support for Polars and Pandas as optional dependencies for schema conversion.

Installation

Install the package using pip:

pip install omop_schema

To include optional dependencies:

pip install omop_schema[polars]
pip install omop_schema[pandas]
pip install omop_schema[polars,pandas]
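
Because Polars and Pandas are optional extras, it can help to check which of them is importable before calling the conversion helpers. The following is a minimal sketch using only the standard library; the helper name `available_backends` is hypothetical and is not part of omop_schema:

```python
from importlib.util import find_spec


def available_backends(candidates=("pyarrow", "polars", "pandas")):
    """Return a mapping of module name -> True if the module can be found."""
    return {name: find_spec(name) is not None for name in candidates}


if __name__ == "__main__":
    for name, installed in available_backends().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays cheap even for large libraries.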

Usage

1. Define and Retrieve OMOP Schemas

The package provides predefined schemas for OMOP CDM tables. You can retrieve the schema for a specific table:

from omop_schema.schema.v54 import OMOPSchemaV54

# Schema definitions for OMOP CDM v5.4
schema_v54 = OMOPSchemaV54()

# PyArrow schema for the "concept" table
concept_schema = schema_v54.get_pyarrow_schema("concept")
print(concept_schema)
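
One use for a table schema is checking that a file has the expected columns before loading it. The sketch below does this with the standard library only. The column list is the CONCEPT table as defined in the OMOP CDM v5.4 specification, typed out by hand here rather than read from omop_schema, and `missing_columns` is a hypothetical helper:

```python
import csv
import io

# CONCEPT table columns per the OMOP CDM v5.4 specification, written out by
# hand for this sketch (an assumption: not read from omop_schema itself).
CONCEPT_COLUMNS = [
    "concept_id",
    "concept_name",
    "domain_id",
    "vocabulary_id",
    "concept_class_id",
    "standard_concept",
    "concept_code",
    "valid_start_date",
    "valid_end_date",
    "invalid_reason",
]


def missing_columns(csv_text, expected):
    """Return the expected column names that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip().lower() for name in header}
    return [name for name in expected if name not in present]


sample = "concept_id,concept_name,domain_id\n1,Example,Drug\n"
print(missing_columns(sample, CONCEPT_COLUMNS))
```

When a PyArrow schema object is at hand, its `names` attribute lists the column names and could replace the hand-written list.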

2. Load Datasets

You can load datasets from a folder containing CSV files. The files are matched to the predefined schemas:

from omop_schema.schema.v54 import OMOPSchemaV54

schema_v54 = OMOPSchemaV54()
datasets = schema_v54.load_csv_dataset("path/to/csv/folder")

# Access a specific table
concept_table = datasets["concept"]
print(concept_table)
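
The description above does not say how files are matched to tables. One plausible convention, assumed here, is that each CSV file is named after its table (for example `concept.csv`). The sketch below implements that assumption with `pathlib`; `match_csv_files` is a hypothetical helper and not the package's implementation:

```python
from pathlib import Path


def match_csv_files(folder, table_names):
    """Map each known table name to a CSV file whose stem matches it.

    Assumes files are named after their table (e.g. concept.csv). Stems are
    compared case-insensitively; files with unknown names are ignored.
    """
    known = {name.lower() for name in table_names}
    matches = {}
    for path in sorted(Path(folder).glob("*.csv")):
        stem = path.stem.lower()
        if stem in known:
            matches[stem] = path
    return matches


if __name__ == "__main__":
    found = match_csv_files("path/to/csv/folder", ["concept", "person"])
    for table, path in found.items():
        print(f"{table} -> {path}")
```

Running a check like this first makes it easy to see which tables are present in a folder before any data is read.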

3. Convert PyArrow Schema to Polars Schema

If Polars is installed, you can convert a PyArrow schema to a Polars-compatible schema:

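import pyarrow as pa

from omop_schema.utils import pyarrow_to_polars_schema

# Example PyArrow schema with an integer column and a string column
arrow_schema = pa.schema(
    [
        pa.field("column1", pa.int64()),
        pa.field("column2", pa.string()),
    ]
)

# Convert to a Polars-compatible schema (requires Polars to be installed)
polars_schema = pyarrow_to_polars_schema(arrow_schema)
print(polars_schema)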

4. Convert PyArrow Schema to Pandas Schema

If Pandas is installed, you can convert a PyArrow schema to a Pandas-compatible schema:

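import pyarrow as pa

from omop_schema.utils import pyarrow_to_pandas_schema

# Example PyArrow schema with an integer column and a string column
arrow_schema = pa.schema(
    [
        pa.field("column1", pa.int64()),
        pa.field("column2", pa.string()),
    ]
)

# Convert to a Pandas-compatible schema (requires Pandas to be installed)
pandas_schema = pyarrow_to_pandas_schema(arrow_schema)
print(pandas_schema)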
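
Conceptually, a schema conversion of this kind maps each Arrow field type to a dtype in the target library. The sketch below shows the idea with plain strings so that it runs without any optional dependency. The mapping table and the `to_pandas_dtypes` helper are assumptions made for illustration and may differ from what `pyarrow_to_pandas_schema` actually returns:

```python
# Arrow type names (as printed by PyArrow) mapped to pandas nullable dtype
# strings. This table is an illustrative assumption, not the package's mapping.
ARROW_TO_PANDAS = {
    "int32": "Int32",
    "int64": "Int64",
    "float": "Float32",
    "double": "Float64",
    "string": "string",
    "bool": "boolean",
}


def to_pandas_dtypes(fields):
    """Translate (column, arrow_type_name) pairs into a column -> dtype dict.

    Arrow types missing from the table fall back to "object".
    """
    return {
        name: ARROW_TO_PANDAS.get(arrow_type, "object")
        for name, arrow_type in fields
    }


dtypes = to_pandas_dtypes([("column1", "int64"), ("column2", "string")])
print(dtypes)
```

A dictionary of this shape can be passed as the `dtype` argument of `pandas.read_csv` or `DataFrame.astype`.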

Optional Dependencies

  • Polars: For converting PyArrow schemas to Polars schemas.
  • Pandas: For converting PyArrow schemas to Pandas schemas.

Contributing

Contributions are welcome! Please open an issue or submit a pull request on the GitHub repository.

License

This project is licensed under the MIT License. See the LICENSE file for details.
