Turns FHIR data into de-identified & aggregated records
Cumulus ETL
Cumulus is an entire healthcare pipeline for population-scale clinical investigations.
Cumulus ETL is the first critical piece of that pipeline.
- It extracts bulk patient data from your EHR.
- It transforms that data by anonymizing it and running NLP on clinical notes.
- It loads that data onto the cloud to be queried by Cumulus Library SQL.
Documentation
For guides on installing & using Cumulus ETL, read our documentation.
Example
A simple run of Cumulus ETL might look something like:
docker compose run \
cumulus-etl \
s3://my-input-bucket/bulk-export/ \
s3://my-output-bucket/delta-lakes/ \
s3://my-phi-bucket/build-and-phi-artifacts/
This command reads NDJSON files from the input bucket, writes the results as Delta Lakes to the output bucket, and saves some bookkeeping configuration to a build/PHI bucket.
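To make the input format concrete: NDJSON stores one JSON object per line, so a bulk export file holds one FHIR resource per line. The sketch below is not part of Cumulus ETL; `count_resource_types` and the sample records are hypothetical, and it only illustrates how such a file could be inspected before a run.

```python
import io
import json
from collections import Counter


def count_resource_types(ndjson_stream):
    """Count FHIR resourceType values in an NDJSON stream (one JSON object per line)."""
    counts = Counter()
    for line in ndjson_stream:
        line = line.strip()
        if not line:  # tolerate blank lines between records
            continue
        resource = json.loads(line)
        counts[resource.get("resourceType", "Unknown")] += 1
    return counts


# Hypothetical sample data, standing in for a file from the input bucket.
sample = io.StringIO(
    '{"resourceType": "Patient", "id": "p1"}\n'
    '{"resourceType": "Condition", "id": "c1"}\n'
    '{"resourceType": "Patient", "id": "p2"}\n'
)
counts = count_resource_types(sample)
print(dict(counts))
```

A quick count like this can confirm that an export contains the resource types you expect before handing the folder to the ETL.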
Contributing
We love 💖 contributions!
If you have a good suggestion 💡 or found a bug 🐛, read our brief contributors guide for pointers to filing issues and what to expect.
If you're a programmer ⌨ and are looking for a starting place to help, we keep a list of good bite-size issues for first-time contributions.