A django add-on that allows models to be decorated with information about which fields contain sensitive information, and an associated management command that creates a script to remove that information.

INSTALL

$ pip install django-scrub-pii

USAGE

Add scrubpii to INSTALLED_APPS in your settings file:

INSTALLED_APPS = (
    ...,
    ...,
    ...,
    'scrubpii',
)

Sensitive fields are marked by adding a sensitive_fields attribute to the model’s Meta class. Because Django only accepts a fixed set of options in the Meta class, it has to be patched to allow the new one. To keep the patch isolated, and to warn if compatibility problems arise in future, the model is defined inside a context manager:

from scrubpii import allow_sensitive_fields

with allow_sensitive_fields():
    class Person(models.Model):
        first_name = models.CharField(max_length=30)
        last_name = models.CharField(max_length=30)
        date_of_birth = models.DateField()
        email = models.EmailField()

        def __unicode__(self):
            return "{0} {1}".format(self.first_name, self.last_name)

        class Meta:
            sensitive_fields = {'last_name', 'first_name', 'email', 'date_of_birth'}

An easy way to do this is to move the sensitive models into a separate file and import them inside the context manager, like so:

from django.db import models
from scrubpii import allow_sensitive_fields

with allow_sensitive_fields():
    from .sensitive_models import *

where sensitive_models.py is:

from django.db import models

__all__ = ['Person']

class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    date_of_birth = models.DateField()
    email = models.EmailField()

    def __unicode__(self):
        return "{0} {1}".format(self.first_name, self.last_name)

    class Meta:
        sensitive_fields = {'last_name', 'first_name', 'email', 'date_of_birth'}
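
As a rough illustration of the patch-and-restore idea described above, here is a minimal sketch of a context manager that temporarily extends a tuple of permitted Meta option names and puts it back afterwards. It is not the scrubpii implementation: fake_options is a stand-in object and temporarily_allow_meta_name is a hypothetical name. Django keeps its permitted names in django.db.models.options.DEFAULT_NAMES, but whether scrubpii patches that tuple is an assumption.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for a module that holds the tuple of permitted Meta option names.
fake_options = SimpleNamespace(DEFAULT_NAMES=("verbose_name", "ordering"))


@contextmanager
def temporarily_allow_meta_name(options, name):
    """Add `name` to options.DEFAULT_NAMES for the duration of the block."""
    original = options.DEFAULT_NAMES
    options.DEFAULT_NAMES = original + (name,)
    try:
        yield
    finally:
        # Always restore the original tuple, even if the block raises.
        options.DEFAULT_NAMES = original


with temporarily_allow_meta_name(fake_options, "sensitive_fields"):
    allowed_inside = "sensitive_fields" in fake_options.DEFAULT_NAMES

allowed_outside = "sensitive_fields" in fake_options.DEFAULT_NAMES
```

Restoring the original value in a finally clause is what keeps the change isolated: code that runs outside the with block sees the unpatched state.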

Once the sensitive fields are defined, a management command generates SQL statements that anonymise a database. The app never anonymises the database directly, to avoid the risk of damaging live data.

The script can be generated by running the management command:

$ python manage.py get_sensitive_data_removal_script > scrub.sql
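
To give a feel for what "generating SQL statements" from a set of sensitive fields can look like, here is a minimal sketch. It does not reproduce the output of get_sensitive_data_removal_script: the function build_scrub_statements, the table name people_person and the REDACTED placeholder are all hypothetical, and a real script would need per-field-type handling (dates, email addresses and so on).

```python
def build_scrub_statements(sensitive_columns, placeholder="REDACTED"):
    """Return one UPDATE statement per table that overwrites the listed columns.

    `sensitive_columns` maps a table name to a collection of column names.
    Names are assumed to be trusted identifiers; no escaping is attempted.
    """
    statements = []
    for table, columns in sorted(sensitive_columns.items()):
        assignments = ", ".join(
            '"{0}" = \'{1}\''.format(column, placeholder)
            for column in sorted(columns)
        )
        statements.append('UPDATE "{0}" SET {1};'.format(table, assignments))
    return statements


script = "\n".join(
    build_scrub_statements({"people_person": {"first_name", "last_name"}})
)
print(script)
```

Emitting text rather than executing the statements mirrors the design choice described above: the script can be reviewed before it is run against anything.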

The suggested workflow is:

  1. Dump database

  2. Reload dump into a temporary database on a secure server (or copy the database file if using sqlite)

  3. Generate anonymisation script

  4. Run anonymisation script against temporary database

  5. Dump temporary database

  6. Delete temporary database

  7. Transmit the anonymised dump to the insecure server
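
For sqlite, the copy-then-scrub part of the workflow above can be sketched with the standard library alone. This is an illustration under stated assumptions, not part of the package: the person table, the file names and the one-line scrub_sql are made up, and in practice scrub_sql would be the contents of the generated scrub.sql.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
live_db = os.path.join(workdir, "live.db")
temp_db = os.path.join(workdir, "scrub_copy.db")

# Create a tiny stand-in for the live database.
conn = sqlite3.connect(live_db)
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO person (email) VALUES ('alice@example.com')")
conn.commit()
conn.close()

# Step 2: work on a copy of the database file, never the original.
shutil.copy(live_db, temp_db)

# Steps 3-4: run the (illustrative, hand-written) scrub script on the copy.
scrub_sql = "UPDATE person SET email = 'redacted@example.invalid';"
conn = sqlite3.connect(temp_db)
conn.executescript(scrub_sql)
conn.commit()
scrubbed = conn.execute("SELECT email FROM person").fetchall()
conn.close()

# The original file is untouched by the scrub.
conn = sqlite3.connect(live_db)
original = conn.execute("SELECT email FROM person").fetchall()
conn.close()

# Step 6: remove the temporary files once the scrubbed copy has been exported.
shutil.rmtree(workdir)
```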

SUPPORTED DATABASES

Currently, only postgresql and sqlite are supported. Patches that add other databases or field types are welcome.

Note that anonymisation under sqlite discards more information than under postgresql. For example, under sqlite all IP addresses are anonymised to the same value, whereas under postgresql different IPs are anonymised to differing values.
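
The difference between the two behaviours can be illustrated in Python. This is only a sketch of the two strategies, not the SQL the package emits: scrub_ip_constant and scrub_ip_distinct are hypothetical names, and the use of SHA-256 is an assumption chosen for the example.

```python
import hashlib


def scrub_ip_constant(ip):
    """Map every address to one fixed value; nothing about the input survives."""
    return "0.0.0.0"


def scrub_ip_distinct(ip):
    """Map an address to a pseudonymous dotted quad derived from its hash.

    The same input always gives the same output, and different inputs
    almost always give different outputs.
    """
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    return ".".join(str(byte) for byte in bytearray(digest[:4]))


same_a = scrub_ip_constant("10.0.0.1")
same_b = scrub_ip_constant("10.0.0.2")
distinct_a = scrub_ip_distinct("10.0.0.1")
distinct_b = scrub_ip_distinct("10.0.0.2")
```

The constant strategy loses the ability to tell records apart by IP, while the distinct strategy preserves that at the cost of weaker anonymity: an unsalted hash of an IPv4 address can be reversed by trying every possible address.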

DEVELOP

$ git clone django-scrub-pii
$ cd django-scrub-pii
$ make

RUNNING TESTS

$ tox

Changelog

1.0 (2016-01-29)

  • Initial release, with basic support for built-in field types, especially on postgres. Limited sqlite support. [MatthewWilkes]

django-scrub-pii Copyright (c) 2016, Matthew Wilkes
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
   derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
