redactdump

Easily create database dumps with support for redacting data (and replacing that data with valid random values).

Supported databases

  • MySQL
  • PostgreSQL

More coming soon...

Installation

To install redactdump, run the following command:

pip install redactdump

Usage

usage: redactdump [-h] -c CONFIG

redactdump

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        Path to dump configuration.
  -u USER, --user USER  Connection username.
  -p PASSWORD, --password PASSWORD
                        Connection password.
  -d DEBUG, --debug DEBUG
                        Enable debug mode.

Configuration

Creating a dump currently requires a configuration file. In the future it may be possible to do everything via the CLI.

Supported replacement values

redactdump uses Faker to generate random data.

A replacement can therefore be any function from the following providers: https://faker.readthedocs.io/en/stable/providers.html

NOTE: redactdump is currently NOT tested with all providers, so some might trigger bugs.
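To illustrate how a replacement string such as name or ipv4 can be turned into a generator call, here is a minimal sketch. It is not redactdump's actual implementation: `resolve_replacement` and `StubGenerator` are hypothetical names, and the stub only stands in for a Faker instance so the example runs without Faker installed.

```python
from typing import Any, Callable


def resolve_replacement(generator: Any, replacement: str) -> Callable[[], Any]:
    """Look up a provider method such as "name" or "ipv4" by its string name."""
    method = getattr(generator, replacement, None)
    # Reject private attributes and anything that is not callable.
    if replacement.startswith("_") or not callable(method):
        raise ValueError(f"unknown replacement provider: {replacement!r}")
    return method


class StubGenerator:
    """Hypothetical stand-in for a Faker instance (assumption, not part of redactdump)."""

    def name(self) -> str:
        return "Jane Roe"

    def ipv4(self) -> str:
        return "10.0.0.1"


if __name__ == "__main__":
    gen = StubGenerator()  # with Faker installed, this could be faker.Faker()
    print(resolve_replacement(gen, "name")())
    print(resolve_replacement(gen, "ipv4")())
```

Failing early on an unknown name is a design choice of this sketch; it surfaces a typo in the configuration before any rows are processed.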

Example configuration:

connection:
  type: pgsql
  host: 127.0.0.1
  port: 5432
  database: postgres

redact:
  patterns:
    column:
      - pattern: '^[a-zA-Z]+_name'
        replacement: name
    data:
      - pattern: '192.168.0.1'
        replacement: ipv4
      - pattern: 'John Doe'
        replacement: name

output:
  type: multi_file
  naming: 'dump-[table_name]-[timestamp]' # Default: [table_name]-[timestamp]
  location: './output/'
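The naming option above uses [table_name] and [timestamp] placeholders. The following sketch shows one way such a template could be expanded. The function name `render_name` and the timestamp format are assumptions made for illustration; the format redactdump actually uses is not stated here.

```python
from datetime import datetime, timezone


def render_name(template: str, table_name: str, when: datetime) -> str:
    """Fill the [table_name] and [timestamp] placeholders of a naming template."""
    # Assumption: the timestamp is rendered as compact UTC digits.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return template.replace("[table_name]", table_name).replace("[timestamp]", stamp)


if __name__ == "__main__":
    moment = datetime(2023, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    print(render_name("dump-[table_name]-[timestamp]", "users", moment))
    # prints: dump-users-20230102030405
```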

Configuration Schema

The configuration schema can be found here

Example

Configuration

connection:
  type: pgsql
  host: 127.0.0.1
  port: 5432
  database: postgres

redact:
  patterns:
    column:
      - pattern: '^new_'
        replacement: name
    data:
      - pattern: '6'
        replacement: random_int

output:
  type: multi_file
  naming: 'dump-[table_name]-[timestamp]' # Default: [table_name]-[timestamp]
  location: './output/'

Original data

(column_1, new_column)

6,"""John Doe"""
6,"John Doe"
6,"John Doe"
6,John Doe
1,\John Doe
1,--John Doe
12312, John Doe
99,!John Doe
99,(John Doe)

Output

INSERT INTO table_name VALUES (890, 'Yolanda Mcdonald');
INSERT INTO table_name VALUES (1982, 'Stephen Lewis');
INSERT INTO table_name VALUES (2952, 'Janet Woodward');
INSERT INTO table_name VALUES (9307, 'Joshua Price');
INSERT INTO table_name VALUES (1, 'Tina Morrison');
INSERT INTO table_name VALUES (1, 'Juan Mejia');
INSERT INTO table_name VALUES (12312, 'Michael Thornton');
INSERT INTO table_name VALUES (99, 'Adrian White');
INSERT INTO table_name VALUES (99, 'Robin Jefferson');
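The example above can be read as two kinds of rules: column patterns replace every value in a matching column, and data patterns replace individual values that match. Below is a rough sketch of that logic using only the standard library. The function `redact_row`, the rule precedence, and the use of `re.search` are assumptions, not redactdump's confirmed behaviour, and the fixed lambdas stand in for Faker providers.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

# A rule pairs a regular expression with a zero-argument value generator.
Rule = Tuple[str, Callable[[], Any]]


def redact_row(
    row: Dict[str, Any], column_rules: List[Rule], data_rules: List[Rule]
) -> Dict[str, Any]:
    """Return a copy of row with matching values replaced."""
    redacted: Dict[str, Any] = {}
    for column, value in row.items():
        for pattern, make in column_rules:
            # Column rules look at the column name and ignore the value.
            if re.search(pattern, column):
                value = make()
                break
        else:
            # Assumption: data rules apply only if no column rule matched.
            for pattern, make in data_rules:
                if re.search(pattern, str(value)):
                    value = make()
                    break
        redacted[column] = value
    return redacted


if __name__ == "__main__":
    column_rules = [(r"^new_", lambda: "Redacted Name")]  # stand-in for name
    data_rules = [(r"6", lambda: 4242)]  # stand-in for random_int
    print(redact_row({"column_1": 6, "new_column": "John Doe"}, column_rules, data_rules))
    print(redact_row({"column_1": 12312, "new_column": " John Doe"}, column_rules, data_rules))
```

With these rules, the value 12312 is left alone because it does not contain a 6, which matches the sample output above.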

Known limitations

Data types not supported

  • box
  • bytea
  • inet
  • interval
  • circle
  • cidr
  • line
  • lseg
  • macaddr
  • macaddr8
  • pg_lsn
  • pg_snapshot
  • point
  • polygon
  • tsquery
  • tsvector
  • txid_snapshot
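Since these types are not supported, it can help to check a table's column types before attempting a dump. The sketch below is a hypothetical pre-flight check, not a redactdump feature: `find_unsupported` is an illustrative name, and it assumes the column types are already available as plain strings spelled as in the list above.

```python
from typing import Dict

# Type names copied from the list of unsupported data types above.
UNSUPPORTED_TYPES = frozenset({
    "box", "bytea", "inet", "interval", "circle", "cidr", "line", "lseg",
    "macaddr", "macaddr8", "pg_lsn", "pg_snapshot", "point", "polygon",
    "tsquery", "tsvector", "txid_snapshot",
})


def find_unsupported(columns: Dict[str, str]) -> Dict[str, str]:
    """Return the columns whose declared type is in the unsupported list."""
    return {
        name: type_name
        for name, type_name in columns.items()
        if type_name.lower() in UNSUPPORTED_TYPES
    }


if __name__ == "__main__":
    table = {"id": "integer", "client_ip": "inet", "area": "polygon"}
    print(find_unsupported(table))
    # prints: {'client_ip': 'inet', 'area': 'polygon'}
```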
