A module for obfuscating a mysqldump file

Py_Obfuscate

This project is a partial port of My_Obfuscate. Under the hood it mostly uses Faker to generate fake data.

Example usage

This package exposes a py_obfuscate module which contains an Obfuscator class with a very simple interface. Its obfuscate method expects two streams: a read stream (e.g. the mysqldump file) and a write stream (e.g. the file to write the obfuscated dump to).

obfuscator.obfuscate(streamIn, streamOut)

As a more practical example, create the file obfuscate.py:

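import sys
import yaml
import py_obfuscate

# Load the obfuscation rules from the config file (obfuscate.yaml)
with open("obfuscate.yaml") as config_file:
    config = yaml.safe_load(config_file)

obfuscator = py_obfuscate.Obfuscator(config)

# Read the dump from stdin and write the obfuscated dump to stdout
obfuscator.obfuscate(sys.stdin, sys.stdout)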

Now create a config file (obfuscate.yaml), e.g.:

tables:
  users:
    name:
      type: "name"
    email:
      type: "email"
    accountno:
      type: "string"
      chars: "1234567890"
      length: 10

You should change this config to reflect the tables and columns you wish to obfuscate.

Now you can run:

mysqldump -c --add-drop-table --hex-blob -u user -ppassword database | python obfuscate.py > obfuscated_dump.sql

Note that the -c option on mysqldump is required to use py_obfuscate. Additionally, the default behavior of mysqldump is to output special characters. This may cause trouble, so you can request hex-encoded blob content with --hex-blob. If you get MySQL errors due to very long lines, try some combination of --max_allowed_packet=128M, --single-transaction, --skip-extended-insert, and --quick.
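If you would rather assemble the mysqldump invocation from a script, the flags discussed above can be collected into an argument list. This is only a sketch: build_mysqldump_command and its long_lines_workaround parameter are hypothetical names invented for this example, not part of py_obfuscate, and the helper adds all four long-line options at once even though you may only need some of them.

```python
import shlex


def build_mysqldump_command(user, password, database, long_lines_workaround=False):
    """Return a mysqldump argument list (hypothetical helper, not part of py_obfuscate)."""
    cmd = [
        "mysqldump",
        "-c",                # required by py_obfuscate
        "--add-drop-table",
        "--hex-blob",        # hex-encode blob content
        "-u", user,
        f"-p{password}",
    ]
    if long_lines_workaround:
        # Options to try if MySQL reports errors caused by very long lines
        cmd += [
            "--max_allowed_packet=128M",
            "--single-transaction",
            "--skip-extended-insert",
            "--quick",
        ]
    cmd.append(database)
    return cmd


command = build_mysqldump_command("user", "password", "database", long_lines_workaround=True)
# Print the command as a single shell-quoted line
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run, with its output piped into obfuscate.py as in the shell example.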

Configuration

In the above example we've used YAML as the configuration format. Since you pass py_obfuscate.Obfuscator a config object (a dictionary), you can use any format you wish, so long as it parses into the same structure. The basic structure is:

locale: <locale string (optional): defaults to "en_GB">
tables:
  <table>:
    truncate: <boolean - set to true to remove inserts for this table. Defaults to `false`>
    <column>:
      type: <type - how to obfuscate this column>
      <type-specific-option>: <type-specific-option-value>

Tables or columns which are omitted from the config are ignored. Currently no warning is given.
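Because the config is just a dictionary, the same structure could come from JSON instead of YAML. The sketch below parses a JSON string into the shape described above; the audit_log table is a made-up name used only to show the truncate option.

```python
import json

# The same structure as the YAML example, expressed as JSON.
# "audit_log" is a hypothetical table name used to illustrate truncate.
config_json = """
{
  "locale": "en_GB",
  "tables": {
    "users": {
      "name": {"type": "name"},
      "email": {"type": "email"},
      "accountno": {"type": "string", "chars": "1234567890", "length": 10}
    },
    "audit_log": {"truncate": true}
  }
}
"""

config = json.loads(config_json)

# The resulting dictionary could then be passed to py_obfuscate.Obfuscator(config)
print(config["tables"]["users"]["accountno"])
```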

Locale

  • type: string
  • default: "en_GB"

This is the locale string passed to Faker.

Truncate

Setting truncate: true for a table removes that table's insert statements from the dump.

Types

The following types are supported:

string

Options:

  • chars (string) The character list to choose from (defaults "1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_+-=[{]}/?|!@#$%^&*()``~")
  • length (integer) The length of the string (defaults 10)
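To illustrate what the chars and length options mean, here is a minimal sketch of how a replacement value of this kind could be produced. It is an illustration only, not py_obfuscate's actual implementation, and random_string is a hypothetical name.

```python
import random


def random_string(chars, length=10):
    """Build a string of `length` characters drawn at random from `chars`.

    Hypothetical helper for illustration; not taken from py_obfuscate.
    """
    return "".join(random.choice(chars) for _ in range(length))


# Mirrors the accountno column from the example config
value = random_string("1234567890", length=10)
print(value)
```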

fixed

Options:

  • value (string|array) Replace column entries with this value or one of the values in the specified array (defaults "")
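The string-or-array behaviour of value can be sketched as follows. This is an assumption about the semantics described above, not py_obfuscate's actual code, and fixed_value is a hypothetical name.

```python
import random


def fixed_value(value=""):
    """Return `value` itself, or one randomly chosen element if it is a list.

    Hypothetical helper for illustration; not taken from py_obfuscate.
    """
    if isinstance(value, (list, tuple)):
        # random.choice raises IndexError on an empty sequence
        return random.choice(value)
    return value


print(fixed_value("REDACTED"))
print(fixed_value(["red", "green", "blue"]))
```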

integer

Options:

  • min (string) Replace column entries with a random integer greater than or equal to this value (defaults 0)
  • max (string) Replace column entries with a random integer less than or equal to this value (defaults 100)
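Both bounds are inclusive. A minimal sketch of that behaviour, assuming min and max may arrive as strings as the option types above suggest (random_integer is a hypothetical name, not py_obfuscate's implementation):

```python
import random


def random_integer(min_value=0, max_value=100):
    """Return a random integer N with min_value <= N <= max_value.

    Hypothetical helper for illustration; not taken from py_obfuscate.
    Bounds are converted with int() so string values such as "18" also work.
    """
    return random.randint(int(min_value), int(max_value))


print(random_integer())
print(random_integer("18", "65"))
```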

email

name

first_name

last_name

username

address

street_address

secondary_address

city

postcode

company

ip

url

sortcode

bank_account

mobile

uk_landline

null

Unit tests

python -m unittest discover -s py_obfuscate

License

This work is provided under the MIT License. See the included LICENSE file.
