Get an OMOP CDM database running quickly.
omop-lite
A small container to get an OMOP CDM database running quickly, with support for both PostgreSQL and SQL Server.
Drop your data into data/, and run the container.
Configuration
You can configure the container or CLI using the following environment variables:
- DB_HOST: The hostname of the database. Default is db.
- DB_PORT: The port number of the database. Default is 5432.
- DB_USER: The username for the database. Default is postgres.
- DB_PASSWORD: The password for the database. Default is password.
- DB_NAME: The name of the database. Default is omop.
- DIALECT: The type of database to use. Default is postgresql, but can also be mssql.
- OMOP_VERISON: Version of the OMOP-CDM schema to load. Default is omop5_4, but can also be omop5_3.
- SCHEMA_NAME: The name of the schema to be created/used in the database. Default is public.
- DATA_DIR: The directory containing the data CSV files. Default is data.
- SYNTHETIC: Load synthetic data (boolean). Default is false.
- SYNTHETIC_NUMBER: Size of synthetic data, 100 or 1000. Default is 100.
- DELIMITER: The delimiter used to separate data. Default is tab, can also be , (a comma).
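As an illustration of how these variables and defaults fit together, here is a minimal Python sketch that reads them from the environment. The Settings class and load_settings function are hypothetical names made up for this example, not part of omop-lite, and treating the word tab as the tab character is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented configuration values."""
    db_host: str
    db_port: int
    db_user: str
    db_password: str
    db_name: str
    dialect: str
    omop_version: str
    schema_name: str
    data_dir: str
    synthetic: bool
    synthetic_number: int
    delimiter: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each documented variable, falling back to its documented default."""
    delimiter = env.get("DELIMITER", "tab")
    return Settings(
        db_host=env.get("DB_HOST", "db"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_user=env.get("DB_USER", "postgres"),
        db_password=env.get("DB_PASSWORD", "password"),
        db_name=env.get("DB_NAME", "omop"),
        dialect=env.get("DIALECT", "postgresql"),
        # Variable name copied verbatim from the list above.
        omop_version=env.get("OMOP_VERISON", "omop5_4"),
        schema_name=env.get("SCHEMA_NAME", "public"),
        data_dir=env.get("DATA_DIR", "data"),
        synthetic=env.get("SYNTHETIC", "false").strip().lower() == "true",
        synthetic_number=int(env.get("SYNTHETIC_NUMBER", "100")),
        # Assumption: the literal word "tab" stands for the tab character.
        delimiter="\t" if delimiter == "tab" else delimiter,
    )
```

Calling load_settings({}) returns the defaults listed above, while load_settings() picks up whatever is set in the current environment.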
Usage
CLI
pip install omop-lite
python omop-lite --help
Docker
docker run -v ./data:/data ghcr.io/health-informatics-uon/omop-lite
# docker-compose.yml
services:
  omop-lite:
    image: ghcr.io/health-informatics-uon/omop-lite
    volumes:
      - ./data:/data
    depends_on:
      - db
  db:
    image: postgres:latest
    environment:
      - POSTGRES_DB=omop
      - POSTGRES_PASSWORD=password
    ports:
      - "5432:5432"
Helm
To install using Helm:
# Install the chart from the OCI registry
helm install omop-lite oci://ghcr.io/health-informatics-uon/charts/omop-lite --version 0.2.2
The Helm chart deploys OMOP Lite as a Kubernetes Job that creates an OMOP CDM in a database. You can customise the installation using a values file:
# values.yaml
env:
  dbHost: postgres
  dbPort: "5432"
  dbUser: postgres
  dbPassword: postgres
  dbName: omop_helm
  dialect: postgresql
  schemaName: public
  synthetic: "false"
Install with custom values:
helm install omop-lite oci://ghcr.io/health-informatics-uon/charts/omop-lite --version 0.2.2 -f values.yaml
Synthetic Data
If you need synthetic data, some is provided in the synthetic directory. It provides a small amount of data to load quickly.
To load the synthetic data, run the container with the SYNTHETIC environment variable set to true. SYNTHETIC_NUMBER selects the dataset:
- 100 is fake data.
- 1000 is Synthea 1k data.
- 1001 is Synthea 1k data with Specimen, Death, and Device Exposure added.
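A minimal sketch of passing these settings to the container: the helper below builds a docker run command that sets both variables with Docker's -e flag. The function name docker_run_command is hypothetical, and the accepted values are simply the three listed above. The container still needs a reachable database, configured through DB_HOST and the other variables.

```python
import shlex

IMAGE = "ghcr.io/health-informatics-uon/omop-lite"

# Dataset sizes taken from the list above.
KNOWN_SIZES = (100, 1000, 1001)


def docker_run_command(synthetic_number: int = 100) -> list[str]:
    """Build the argv for running the image with synthetic data enabled."""
    if synthetic_number not in KNOWN_SIZES:
        raise ValueError(f"SYNTHETIC_NUMBER must be one of {KNOWN_SIZES}")
    return [
        "docker", "run",
        "-e", "SYNTHETIC=true",
        "-e", f"SYNTHETIC_NUMBER={synthetic_number}",
        IMAGE,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command for the Synthea 1k dataset.
    print(shlex.join(docker_run_command(1000)))
```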
Bring Your Own Data
You can provide your own data for loading into the tables by placing your files in the data/ directory. This should contain .csv files matching the data tables (DRUG_STRENGTH.csv, CONCEPT.csv, etc.).
To match the vocabulary files from Athena, this data should be tab-separated, but keep the .csv file extension.
You can override the delimiter with the DELIMITER environment variable.
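Before loading, it can help to confirm that every file in data/ uses the delimiter you expect. The sketch below is one way to do that using only the standard library; detect_delimiter and check_data_dir are hypothetical helpers, not part of omop-lite, and the check only looks at each file's header line.

```python
from pathlib import Path


def detect_delimiter(path: Path, candidates: str = "\t,") -> str:
    """Guess the delimiter by counting each candidate in the header line."""
    with path.open(newline="", encoding="utf-8") as handle:
        header = handle.readline()
    # On a tie (for example a single-column file) the first candidate wins.
    return max(candidates, key=header.count)


def check_data_dir(data_dir: str = "data", expected: str = "\t") -> dict[str, bool]:
    """Map each .csv file name to whether its header uses the expected delimiter."""
    return {
        path.name: detect_delimiter(path) == expected
        for path in sorted(Path(data_dir).glob("*.csv"))
    }


if __name__ == "__main__":
    for name, ok in check_data_dir().items():
        print(f"{name}: {'ok' if ok else 'unexpected delimiter'}")
```

If your files are comma-separated, pass expected="," here and set DELIMITER accordingly when running omop-lite.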
Text search OMOP
Full-text search
Adding a tsvector column to the concept table and an index on that column makes full-text search queries on the concept table run much faster.
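To make the idea concrete, the sketch below assembles the kind of PostgreSQL statements involved: a generated tsvector column, a GIN index on it, and a query using the @@ match operator. This is an illustration under assumptions, not necessarily how omop-lite sets things up: the column name concept_name_tsv, the index name, and the 'english' text search configuration are all made up for the example, and generated columns need PostgreSQL 12 or later.

```python
def full_text_search_sql(
    schema: str = "public",
    column: str = "concept_name_tsv",  # hypothetical column name
) -> dict[str, str]:
    """Return example SQL for adding and using a tsvector column on concept."""
    table = f"{schema}.concept"
    return {
        # Keep the tsvector in sync with concept_name automatically.
        "add_column": (
            f"ALTER TABLE {table} ADD COLUMN {column} tsvector "
            "GENERATED ALWAYS AS (to_tsvector('english', concept_name)) STORED;"
        ),
        # A GIN index lets @@ matches avoid scanning the whole table.
        "create_index": (
            f"CREATE INDEX idx_concept_{column} ON {table} USING GIN ({column});"
        ),
        # %s is a driver-side placeholder for the search text.
        "search": (
            f"SELECT concept_id, concept_name FROM {table} "
            f"WHERE {column} @@ plainto_tsquery('english', %s);"
        ),
    }


if __name__ == "__main__":
    for name, statement in full_text_search_sql().items():
        print(f"-- {name}\n{statement}\n")
```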
Vector search
Postgres does vector search too!
Enabling text search
To enable these features in omop-lite, use the text-search profile:
docker compose --profile text-search up
To do this, you need to have text-search/embeddings.parquet, containing concept_ids and embeddings (an example file is provided).
This uses pgvector to create an embeddings table.
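For intuition about what a vector search over such a table computes, here is a small pure-Python sketch that ranks concept_ids by cosine distance to a query embedding, the same measure pgvector exposes through its <=> operator. The function names and the toy two-dimensional embeddings are invented for the example; in the database the ranking is done by pgvector, not by code like this.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity for two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest_concepts(
    query: Sequence[float],
    embeddings: dict[int, Sequence[float]],
    k: int = 3,
) -> list[int]:
    """Return the k concept_ids whose embeddings are closest to the query."""
    ranked = sorted(embeddings, key=lambda cid: cosine_distance(query, embeddings[cid]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: keys stand in for concept_ids, values for their embeddings.
    toy = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
    print(nearest_concepts([1.0, 0.1], toy, k=2))
```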
Testing
If you're a developer and want to iterate on omop-lite quickly, synthetic/ contains a small subset of the vocabularies that is sufficient for a build.
If you wish to test the vector search, there are matching embeddings in embeddings/embeddings.parquet.