Automate management of PII redacted schemas for dbt projects.

Project description

Schema Builder creates dbt schema files, SQL models, and default PII / non-PII views for the tables in the given Snowflake schemas.

For each specified application schema, the script generates dbt models for a <SCHEMA> and a <SCHEMA>_PII schema. Together with the raw source schema, these form a set of three schemas that we refer to as a “trifecta”:

  • <SCHEMA>_<RAW_SUFFIX> contains the original source tables.

  • <SCHEMA>_PII contains views on the _RAW tables that have un-redacted PII.

  • <SCHEMA> contains views on the _RAW tables with sensitive data redacted.
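
The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function name trifecta_names, the Trifecta tuple, and the default suffix "RAW" are hypothetical and are not part of Schema Builder's API.

```python
from typing import NamedTuple


class Trifecta(NamedTuple):
    """The three related schema names for one application schema."""
    raw: str       # original source tables
    pii: str       # views with un-redacted PII
    redacted: str  # views with sensitive data redacted


def trifecta_names(schema: str, raw_suffix: str = "RAW") -> Trifecta:
    # Hypothetical helper: the default suffix "RAW" is an assumption
    # made for this example, not a documented default.
    return Trifecta(
        raw=f"{schema}_{raw_suffix}",
        pii=f"{schema}_PII",
        redacted=schema,
    )


names = trifecta_names("LMS")
print(names.raw, names.pii, names.redacted)
```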

Application schemas can be sourced from multiple raw schemas. This allows you to specify which tables should be pulled from which raw schema to construct the “trifecta”.
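
As a rough sketch of the idea, the mapping from raw schemas to tables could be resolved as shown here. The dictionary layout, the schema and table names, and the resolve_sources function are all hypothetical; Schema Builder's real configuration format is described in its docs and may look quite different.

```python
from typing import Dict, List


def resolve_sources(table_sources: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each table name to the fully qualified raw relation it comes from.

    table_sources is a hypothetical mapping of raw schema name to the
    tables that should be pulled from that raw schema.
    """
    resolved: Dict[str, str] = {}
    for raw_schema, tables in table_sources.items():
        for table in tables:
            if table in resolved:
                # A table may only be sourced from one raw schema.
                raise ValueError(
                    f"{table} is listed under more than one raw schema"
                )
            resolved[table] = f"{raw_schema}.{table}"
    return resolved


sources = resolve_sources({
    "LMS_RAW": ["AUTH_USER"],
    "LMS_EVENTS_RAW": ["TRACKING_LOG"],
})
print(sources)
```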

Schema Builder ensures that all three schemas provide the same interface to the data (number and order of columns match what is present in the _RAW schema).
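
To illustrate what "the same interface" means, here is a minimal sketch that renders the SELECT statement for a PII view and a redacted view from one ordered column list. It assumes redaction replaces a PII column with NULL, which is an assumption made for this example only; the function name render_view_select and the table and column names are hypothetical and not taken from Schema Builder.

```python
from typing import Iterable, Sequence


def render_view_select(
    raw_relation: str,
    columns: Sequence[str],
    pii_columns: Iterable[str],
    redact: bool,
) -> str:
    """Render a SELECT that keeps every raw column, in the raw order."""
    pii = {name.upper() for name in pii_columns}
    select_items = []
    for column in columns:
        if redact and column.upper() in pii:
            # Assumed redaction strategy: keep the column, null the value.
            select_items.append(f"NULL AS {column}")
        else:
            select_items.append(column)
    return "SELECT\n    " + ",\n    ".join(select_items) + f"\nFROM {raw_relation}"


columns = ["ID", "EMAIL", "DATE_JOINED"]
pii_sql = render_view_select("LMS_RAW.AUTH_USER", columns, ["EMAIL"], redact=False)
safe_sql = render_view_select("LMS_RAW.AUTH_USER", columns, ["EMAIL"], redact=True)
print(safe_sql)
```

Because both views are rendered from the same ordered column list, they expose the same number of columns in the same order; only the values of the PII columns differ.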

Once the script has run successfully, you can execute a dbt run to create or update the views in <SCHEMA> and <SCHEMA>_PII. If the source data in the <SCHEMA>_<RAW_SUFFIX> schema changes often, re-run Schema Builder regularly so the generated models keep up with changes to the tables and columns stored there.

Schema Builder will also automatically create sources in one or more other dbt projects so that they can use the results of these models as sources.

See the docs for more information.

License

The code in this repository is licensed under the AGPL 3.0 unless otherwise noted.

Please see LICENSE.txt for details.

How To Contribute

Contributions are very welcome. Please read the Contribution Guide for details. Although the guidelines were written with edx-platform in mind, they should be followed for all Open edX projects.

The pull request description template should be automatically applied if you are creating a pull request from GitHub. Otherwise you can find it at PULL_REQUEST_TEMPLATE.md.

The issue report template should be automatically applied if you are creating an issue on GitHub as well. Otherwise you can find it at ISSUE_TEMPLATE.md.

Reporting Security Issues

Please do not report security issues in public. Please email security@edx.org.

Getting Help

If you’re having trouble, we have discussion forums at https://discuss.openedx.org where you can connect with others in the community.

Our real-time conversations are on Slack. You can request a Slack invitation, then join our community Slack team.

For more information about these options, see the getting assistance page.

Source Distribution

dbt-schema-builder-0.5.0.tar.gz (36.9 kB view details)

Uploaded Source

Built Distribution

dbt_schema_builder-0.5.0-py2.py3-none-any.whl (32.1 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file dbt-schema-builder-0.5.0.tar.gz.

File metadata

  • Download URL: dbt-schema-builder-0.5.0.tar.gz
  • Upload date:
  • Size: 36.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.19

File hashes

Hashes for dbt-schema-builder-0.5.0.tar.gz
Algorithm Hash digest
SHA256 6b3893c81c7cef6da28bda0bf4cd499878a15e024a8220cd5afcd0707cdd781a
MD5 6e60e50161172225f1772765a08e4089
BLAKE2b-256 85011840b49874f810b7350b10fb773a9360028470dfcd8a6dee01e0867e2d2f

File details

Details for the file dbt_schema_builder-0.5.0-py2.py3-none-any.whl.

File hashes

Hashes for dbt_schema_builder-0.5.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 1ca866499b36be1cad7d9119b20851816428a9e6611dfb57891855fe8d3a362e
MD5 a566992c4c3d509c416ef10216974e60
BLAKE2b-256 ee219c60425fbfb1c5b7566bf8935a04d2f9e7550e6e096cfe8f3329496ce935
