Sanitizes contents of a database.

Project description

Database sanitation tool

database-sanitizer is a tool which retrieves a database dump from a relational database and sanitizes the retrieved data according to rules defined in a configuration file. The tool currently supports PostgreSQL and MySQL databases.

Installation

database-sanitizer can be installed from PyPI with pip like this:

$ pip install database-sanitizer

If you are using MySQL, you need to install the package like this instead, so that additional requirements are included:

$ pip install database-sanitizer[MySQL]

Usage

Once the package has been installed, database-sanitizer can be used like this:

$ database-sanitizer <DATABASE-URL>

The DATABASE-URL command line argument must be provided so that the tool knows how to retrieve the dump from the database. With PostgreSQL, it would be something like this:

$ database-sanitizer postgres://user:password@host/database

However, unless a configuration file is provided, no sanitation will be performed on the retrieved database dump, which leads us to the next section.

Configuration

Rules for the sanitation can be given in a configuration file written in YAML. The path to the configuration file is then given to the command line utility with the --config argument (-c for short) like this:

$ database-sanitizer -c config.yml postgres://user:password@host/database
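
The same invocation can also be driven from a Python script. The following is only a minimal sketch: the helper names build_command and run_sanitizer are hypothetical, and it assumes the database-sanitizer executable is installed and available on PATH. Only the --config argument and the positional database URL shown above are used.

```python
import subprocess


def build_command(database_url, config_path=None):
    """Build the argument list for the database-sanitizer command line tool."""
    command = ["database-sanitizer"]
    if config_path is not None:
        # --config (or -c) points the tool at the YAML rules file.
        command += ["--config", config_path]
    command.append(database_url)
    return command


def run_sanitizer(database_url, config_path=None):
    """Run the tool; assumes database-sanitizer is installed and on PATH."""
    # check=True raises CalledProcessError if the tool exits with an error.
    return subprocess.run(build_command(database_url, config_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with special characters in the database URL.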

The configuration file uses the following kind of syntax:

config:
  addons:
    - some.other.package
    - yet.another.package
strategy:
  user:
    first_name: name.first_name
    last_name: name.last_name
    secret_key: string.empty

The example configuration above first lists two "addon packages": names of Python packages in which the sanitizer will look for sanitizer functions. They are completely optional and can be omitted, in which case only the built-in sanitizers and sanitizer functions defined in a package called sanitizers will be used.
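
To illustrate that the addons list is optional, here is a small sketch that reads it from the example configuration, written out as the nested mapping a YAML parser would typically produce. The function addon_packages is hypothetical and is not part of database-sanitizer itself; it only shows how a missing config or addons key could be treated as an empty list.

```python
# The example configuration as a plain Python mapping (what a YAML parser
# would typically return for the file shown above).
configuration = {
    "config": {
        "addons": ["some.other.package", "yet.another.package"],
    },
    "strategy": {
        "user": {
            "first_name": "name.first_name",
            "last_name": "name.last_name",
            "secret_key": "string.empty",
        },
    },
}


def addon_packages(configuration):
    """Return the configured addon package names, or [] when omitted."""
    config_section = configuration.get("config") or {}
    return list(config_section.get("addons") or [])
```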

The strategy portion of the configuration contains the actual sanitation rules. First you give the name of the database table (user in the example), followed by the column names in that table, each mapped to the name of a sanitation function. The function name consists of two parts separated by a dot: the Python module name and the name of the actual function, which will be prefixed with sanitize_. So name.first_name refers to a function called sanitize_first_name in a file called name.py.
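
The naming rule can be sketched in a few lines of Python. This is an illustration of the convention described above, not the tool's actual implementation: resolve_sanitizer_name is a hypothetical helper, and the one-argument signature and placeholder return value of sanitize_first_name are assumptions about what a function in name.py might look like.

```python
def resolve_sanitizer_name(qualified_name):
    """Split 'module.function' into (module name, prefixed function name)."""
    parts = qualified_name.split(".")
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected '<module>.<function>', got %r" % qualified_name)
    module_name, function_name = parts
    return module_name, "sanitize_" + function_name


# Hypothetical contents of name.py. Taking the column value and returning
# its replacement is an assumption made for this sketch.
def sanitize_first_name(value):
    # Leave NULL values alone, replace everything else with a placeholder.
    return None if value is None else "John"


# The strategy section of the example configuration as a plain mapping.
strategy = {
    "user": {
        "first_name": "name.first_name",
        "last_name": "name.last_name",
        "secret_key": "string.empty",
    },
}

for table, columns in strategy.items():
    for column, sanitizer in columns.items():
        module_name, function_name = resolve_sanitizer_name(sanitizer)
        print("%s.%s -> %s.py: %s" % (table, column, module_name, function_name))
```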

Project details

Release history

0.3.0 (this version)
0.2.0
0.1.0

Download files

Filename                                        Size     File type  Python version  Upload date
database_sanitizer-0.3.0-py2.py3-none-any.whl   23.8 kB  Wheel      py2.py3         May 29, 2018
database-sanitizer-0.3.0.tar.gz                 17.6 kB  Source     None            May 29, 2018
