Get an OMOP CDM database running quickly.

omop-lite

A small container to get an OMOP CDM database running quickly, with support for both PostgreSQL and SQL Server.

Drop your data into data/, and run the container.

Configuration

You can configure the container or CLI using the following environment variables:

  • DB_HOST: The hostname of the database. Default is db.
  • DB_PORT: The port number of the database. Default is 5432.
  • DB_USER: The username for the database. Default is postgres.
  • DB_PASSWORD: The password for the database. Default is password.
  • DB_NAME: The name of the database. Default is omop.
  • DIALECT: The type of database to use. Default is postgresql, but can also be mssql.
  • OMOP_VERISON: Version of the OMOP-CDM schema to load. Default is omop5_4, but can also be omop5_3.
  • SCHEMA_NAME: The name of the schema to be created/used in the database. Default is public.
  • DATA_DIR: The directory containing the data CSV files. Default is data.
  • SYNTHETIC: Load synthetic data (boolean). Default is false.
  • SYNTHETIC_NUMBER: Which synthetic dataset to load: 100, 1000, or 1001. Default is 100.
  • DELIMITER: The delimiter that separates fields in the data files. Default is a tab; can also be a comma (,).
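
As a rough illustration of how these variables and their defaults combine, the sketch below overlays an environment mapping on the documented defaults. The helper name resolve_settings, the boolean parsing, and the DIALECT check are assumptions made for this example, not omop-lite's actual implementation.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the list above (variable names spelled as listed).
DEFAULTS = {
    "DB_HOST": "db",
    "DB_PORT": "5432",
    "DB_USER": "postgres",
    "DB_PASSWORD": "password",
    "DB_NAME": "omop",
    "DIALECT": "postgresql",
    "OMOP_VERISON": "omop5_4",
    "SCHEMA_NAME": "public",
    "DATA_DIR": "data",
    "SYNTHETIC": "false",
    "SYNTHETIC_NUMBER": "100",
    "DELIMITER": "\t",
}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Overlay environment values on the documented defaults (hypothetical helper)."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # Assumption: reject dialects other than the two the README names.
    if settings["DIALECT"] not in ("postgresql", "mssql"):
        raise ValueError(f"unsupported DIALECT: {settings['DIALECT']!r}")
    # Assumption: the port is an integer and only the string "true" enables SYNTHETIC.
    settings["DB_PORT"] = int(settings["DB_PORT"])
    settings["SYNTHETIC"] = settings["SYNTHETIC"].strip().lower() == "true"
    return settings
```

Calling resolve_settings({"DIALECT": "mssql", "DB_PORT": "1433"}) would keep every other value at its documented default.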

Usage

CLI

pip install omop-lite
python omop-lite --help

Docker

docker run -v ./data:/data ghcr.io/health-informatics-uon/omop-lite

# docker-compose.yml
services:
  omop-lite:
    image: ghcr.io/health-informatics-uon/omop-lite
    volumes:
      - ./data:/data
    depends_on:
      - db

  db:
    image: postgres:latest
    environment:
      - POSTGRES_DB=omop
      - POSTGRES_PASSWORD=password
    ports:
      - "5432:5432"
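
In Compose, the short form of depends_on only orders container start-up; it does not wait for PostgreSQL to accept connections. Whether omop-lite retries the connection itself is not stated here, so the following is a minimal sketch of a readiness check you could run before loading. The function name wait_for_port and its parameters are hypothetical, and an open TCP port does not by itself guarantee the database is ready for queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the Compose file above, wait_for_port("db", 5432) would be called from inside the Compose network, or wait_for_port("localhost", 5432) from the host, since the port is published.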

Helm

To install using Helm:

# Install the chart directly from the OCI registry
helm install omop-lite oci://ghcr.io/health-informatics-uon/charts/omop-lite --version 0.2.2

The Helm chart deploys OMOP Lite as a Kubernetes Job that creates an OMOP CDM in a database. You can customise the installation using a values file:

# values.yaml
env:
  dbHost: postgres
  dbPort: "5432"
  dbUser: postgres
  dbPassword: postgres
  dbName: omop_helm
  dialect: postgresql
  schemaName: public
  synthetic: "false" 

Install with custom values:

helm install omop-lite oci://ghcr.io/health-informatics-uon/charts/omop-lite --version 0.2.2 -f values.yaml
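
The keys in the values file above look like camelCase forms of the environment variable names (DB_HOST becomes dbHost, SCHEMA_NAME becomes schemaName). A small sketch of that naming pattern follows; env_to_helm_key is a hypothetical helper, and only the keys shown in the values file above are confirmed, so whether the chart accepts keys for the remaining variables is an assumption.

```python
def env_to_helm_key(name: str) -> str:
    """Convert an UPPER_SNAKE_CASE variable name to camelCase (hypothetical helper)."""
    head, *rest = name.lower().split("_")
    return head + "".join(part.capitalize() for part in rest)


def helm_env_values(settings: dict) -> dict:
    """Nest settings under "env", with every value as a string as in the values file above."""
    return {"env": {env_to_helm_key(key): str(value) for key, value in settings.items()}}
```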

Synthetic Data

If you need synthetic data, a small, quick-to-load set is provided in the synthetic directory. To load it, run the container with the SYNTHETIC environment variable set to true. Set SYNTHETIC_NUMBER to choose a dataset:

  • 100 is fake data.
  • 1000 is Synthea 1k data.
  • 1001 is Synthea 1k data with Specimen, Death, and Device Exposure added.

Bring Your Own Data

You can provide your own data for loading into the tables by placing your files in the data/ directory. The directory should contain .csv files named after the data tables (DRUG_STRENGTH.csv, CONCEPT.csv, etc.).

To match the vocabulary files from Athena, the data should be tab-separated but keep the .csv file extension. You can override the delimiter with the DELIMITER environment variable.
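
To check that a file has the expected shape before loading it, you can parse it with Python's csv module. The sketch below is only an illustration: read_rows is a hypothetical helper, the sample rows are made up, and disabling quote handling with QUOTE_NONE is an assumption that suits unquoted tab-separated exports where names may contain quote characters.

```python
import csv
import io


def read_rows(text: str, delimiter: str = "\t") -> list[dict]:
    """Parse delimited text with a header row into a list of dicts."""
    # QUOTE_NONE treats quote characters as ordinary data rather than field wrappers.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return list(reader)


# Made-up sample in the tab-separated layout described above.
sample = 'concept_id\tconcept_name\n1\tExample "quoted" name\n2\tAnother, with a comma\n'
rows = read_rows(sample)
```

For comma-separated files, pass delimiter="," to mirror setting DELIMITER to a comma.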

Text search OMOP

Full-text search

Adding a tsvector column to the concept table and an index on that column makes full-text search queries on the concept table run much faster.
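
To make the idea concrete, the sketch below builds the kind of PostgreSQL statements involved: a generated tsvector column, a GIN index on it, and a query that uses the index. The column name concept_name_tsv, the index name, and the 'english' text search configuration are assumptions for illustration; the statements omop-lite actually runs may differ. Generated columns require PostgreSQL 12 or later.

```python
def full_text_search_sql(schema: str = "public", column: str = "concept_name_tsv") -> dict:
    """Build illustrative statements for a tsvector column, a GIN index, and a search query.

    schema and column are interpolated directly, so pass trusted values only.
    """
    table = f'"{schema}"."concept"'
    return {
        # Stored generated column: PostgreSQL keeps it in sync with concept_name.
        "add_column": (
            f'ALTER TABLE {table} ADD COLUMN "{column}" tsvector '
            "GENERATED ALWAYS AS (to_tsvector('english', concept_name)) STORED"
        ),
        # GIN is the index type PostgreSQL recommends for tsvector columns.
        "create_index": f'CREATE INDEX "{column}_idx" ON {table} USING GIN ("{column}")',
        # %s is a driver-side placeholder (psycopg style) for the search text.
        "search": (
            f"SELECT concept_id, concept_name FROM {table} "
            f"WHERE \"{column}\" @@ plainto_tsquery('english', %s)"
        ),
    }
```

The returned strings can be passed to any PostgreSQL driver; the search statement takes the user's query text as its single parameter.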

Vector search

PostgreSQL also supports vector search, provided here by the pgvector extension.

Enabling text search

To enable these features in omop-lite, use the text-search profile:

docker compose --profile text-search up

This profile requires the file text-search/embeddings.parquet, which contains concept_ids and embeddings (an example file is provided). It uses pgvector to create an embeddings table.
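
Conceptually, a vector search ranks stored embeddings by their distance to a query embedding. The brute-force sketch below does this in plain Python with cosine distance, which is what pgvector's <=> operator computes; which distance metric omop-lite's setup uses is not stated here, so treat that choice as an assumption. The function names and the toy vectors in the usage note are illustrative only.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def nearest_concepts(query: Sequence[float], embeddings: dict, k: int = 3) -> list:
    """Return the k concept_ids whose embeddings are closest to the query."""
    ranked = sorted(embeddings, key=lambda cid: cosine_distance(query, embeddings[cid]))
    return ranked[:k]
```

For example, with the toy embeddings {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.9, 0.1]} and the query [1.0, 0.0], nearest_concepts with k=2 returns concept 1 followed by concept 3. A database index avoids this full scan, but the ranking idea is the same.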

Testing

If you're a developer and want to iterate on omop-lite quickly, synthetic/ contains a small subset of the vocabularies that is sufficient for a build. If you wish to test the vector search, there are matching embeddings in embeddings/embeddings.parquet.
