Anonymization of data in pg_dump

Reason this release was yanked:

Bug in the generation of unique values

Project description

pg_stage

A utility for generating a database dump in which the data is obfuscated. Such a dump can be used on development and staging servers without fear of the data being stolen.

How does it work?

The utility reads the output of the pg_dump command line by line and decides whether to obfuscate each value based on the comments attached to the table or column.
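The line-by-line approach can be illustrated with a minimal sketch. This is not the pg_stage implementation: the function `obfuscate_stream`, the two regular expressions, and the `mutations` mapping are hypothetical names introduced only for illustration. The sketch assumes a plain-format dump in which `COMMENT ON COLUMN` statements appear before the `COPY ... FROM stdin;` data blocks, and it ignores quoted identifiers and COPY escaping.

```python
import json
import re

# Hypothetical patterns for the two kinds of lines the sketch cares about.
COMMENT_RE = re.compile(
    r"^COMMENT ON COLUMN (?P<table>[\w.]+)\.(?P<column>\w+) IS 'anon: (?P<rules>.*)';$"
)
COPY_RE = re.compile(r"^COPY (?P<table>[\w.]+) \((?P<columns>.*)\) FROM stdin;$")


def obfuscate_stream(lines, mutations):
    """Yield dump lines, replacing values in columns that carry an 'anon:' comment.

    `mutations` maps a mutation name to a callable taking the old value
    and returning the new one (an assumption of this sketch).
    """
    rules = {}     # (table, column) -> mutation name
    active = None  # (table, [columns]) while inside a COPY data block
    for line in lines:
        text = line.rstrip("\n")
        if active is None:
            comment = COMMENT_RE.match(text)
            copy = COPY_RE.match(text)
            if comment:
                for rule in json.loads(comment["rules"]):
                    key = (comment["table"], comment["column"])
                    rules[key] = rule["mutation_name"]
            elif copy:
                columns = [c.strip() for c in copy["columns"].split(",")]
                active = (copy["table"], columns)
            yield line
        elif text == "\\.":
            # End-of-data marker closes the COPY block.
            active = None
            yield line
        else:
            table, columns = active
            values = text.split("\t")
            for i, column in enumerate(columns):
                name = rules.get((table, column))
                # "\N" is the COPY representation of NULL; leave it untouched.
                if name is not None and values[i] != "\\N":
                    values[i] = mutations[name](values[i])
            yield "\t".join(values) + "\n"
```

Because the function is a generator over lines, it can be fed `sys.stdin` and written to `sys.stdout` without loading the whole dump into memory.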

Usage example

  1. Create a file with approximately the following contents:
# main.py
from pg_stage.obfuscator import Obfuscator


obfuscator = Obfuscator(locale='ru_RU')
obfuscator.run()
  2. Add comments to a column or table:
COMMENT ON COLUMN table_1.first_name IS 'anon: [{"mutation_name": "first_name"}]';
  3. Run pg_dump and redirect the stream to the running script process:
pg_dump -d database | python3 main.py > dump.sql
  4. The resulting dump.sql contains the obfuscated data for the annotated tables and columns.
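With a plain shell pipe, a failure of pg_dump can go unnoticed because the shell reports only the exit status of the last command. As an optional alternative, the same pipeline can be driven from Python so that both exit codes are checked. The helper `dump_through_filter` below is a hypothetical name and is not part of pg_stage; it only uses the standard `subprocess` module.

```python
import subprocess


def dump_through_filter(dump_cmd, filter_cmd, output_path):
    """Pipe the stdout of dump_cmd into filter_cmd and write the result to output_path.

    Raises RuntimeError if either process exits with a non-zero status.
    """
    with open(output_path, "wb") as out:
        dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
        filt = subprocess.Popen(filter_cmd, stdin=dump.stdout, stdout=out)
        # Close our copy of the pipe so the dump process sees a broken pipe
        # if the filter exits early.
        dump.stdout.close()
        filter_status = filt.wait()
        dump_status = dump.wait()
    if dump_status != 0 or filter_status != 0:
        raise RuntimeError(
            f"pipeline failed: dump exited with {dump_status}, "
            f"filter exited with {filter_status}"
        )
```

Under these assumptions, the command from step 3 corresponds to `dump_through_filter(["pg_dump", "-d", "database"], ["python3", "main.py"], "dump.sql")`.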

Supported types of obfuscation

You can see the current list here.

Why did I write my utility?

I adhere to the rule that, for security reasons, third-party plugins should not be installed in a production database (most similar utilities are distributed as database extensions).

In addition, I could not find any similar utility that obfuscates data uniformly across related tables. This prompted me to write my own utility, which can obfuscate data in tables related by a foreign key so that matching rows receive the same result.

Example:

COMMENT ON COLUMN table_1.first_name IS 'anon: [{"mutation_name": "first_name", "relations": [{"table_name": "table_1", "column_name": "last_name", "from_column_name": "id", "to_column_name": "id"}]}]';

Here, relations lists the tables and columns that must be obfuscated consistently with the current field.
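One way to picture "the same result by a foreign key" is a cache keyed by the value of the linking column: the first time a key is seen a new fake value is generated, and every later row with the same key, in any related table, reuses it. The sketch below shows only this idea; the class `SharedValues` and its `make_value` argument are hypothetical and do not describe how pg_stage implements relations.

```python
import itertools


class SharedValues:
    """Hand out one generated value per key and reuse it on later lookups."""

    def __init__(self, make_value):
        self._make_value = make_value  # callable producing a new fake value
        self._by_key = {}              # key (e.g. the id value) -> fake value

    def get(self, key):
        if key not in self._by_key:
            self._by_key[key] = self._make_value()
        return self._by_key[key]


# Illustrative use: two tables whose rows are linked by the same id value.
counter = itertools.count(1)
shared = SharedValues(lambda: f"name_{next(counter)}")

table_1_rows = [("7", "Ivan"), ("8", "Petr")]
table_2_rows = [("8", "Petr"), ("7", "Ivan")]

obfuscated_1 = [(key, shared.get(key)) for key, _ in table_1_rows]
obfuscated_2 = [(key, shared.get(key)) for key, _ in table_2_rows]
```

In this sketch the row with id 7 receives the same replacement in both tables, regardless of the order in which the rows are processed.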

Thanks for the inspiration
