
leukeleu-django-gdpr is a tool to generate GDPR documentation based on the project's models.


leukeleu-django-gdpr

Installation

pip install leukeleu-django-gdpr

Add to INSTALLED_APPS:

INSTALLED_APPS = [
    # ...
    'leukeleu_django_gdpr',
    # ...
]

Configuration

By default, the gdpr management command will write gdpr.yml to settings.BASE_DIR.

To change the output directory (without changing settings.BASE_DIR) add DJANGO_GDPR_YML_DIR to your settings:

DJANGO_GDPR_YML_DIR = os.path.join(BASE_DIR, 'docs')

Usage

On first run, leukeleu-django-gdpr will generate a gdpr.yml file with a models list. This is a list of models in your project, each containing a list of fields.

./manage.py gdpr

This creates gdpr.yml in settings.BASE_DIR (or in DJANGO_GDPR_YML_DIR, if that setting is defined). Add the file to version control. Each model in the models list has the following structure:

models:
  auth.User:
    name: User
    fields:
      username:
        name: Username
        description: String (up to %(max_length)s)
        help_text: Required. 150 characters or fewer. Letters, digits and @/./+/-/_
          only.
        required: true
        pii: null
      first_name:
        name: First Name
        description: String (up to %(max_length)s)
        help_text: ''
        required: false
        pii: null

Leukeleu-django-gdpr adds pii: null to every field. The objective is to replace each null with the correct boolean value: pii: true if the field holds PII data, pii: false otherwise.

When run again, leukeleu-django-gdpr will persist those values, allowing you to work your way to eliminating all pii: nulls.
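While working through the file, it can help to list the fields that are still unclassified. The following is a minimal sketch, not part of leukeleu-django-gdpr: the helper name unclassified_fields is hypothetical, and it operates on the already-parsed contents of gdpr.yml (for example the dict returned by a YAML loader), assuming the structure shown above.

```python
def unclassified_fields(data):
    """Return 'app.Model.field' labels whose pii value is still None (YAML null)."""
    labels = []
    for model_label, model in (data.get("models") or {}).items():
        for field_name, field in (model.get("fields") or {}).items():
            if field.get("pii") is None:
                labels.append(f"{model_label}.{field_name}")
    return labels


# Stand-in for the parsed gdpr.yml; a real run would load the file instead.
example = {
    "models": {
        "auth.User": {
            "name": "User",
            "fields": {
                "username": {"name": "Username", "required": True, "pii": True},
                "first_name": {"name": "First Name", "required": False, "pii": None},
            },
        }
    }
}

print(unclassified_fields(example))  # ['auth.User.first_name']
```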

Leukeleu-django-gdpr outputs counts of the pii: values when run:

./manage.py gdpr
Checking...
No PII set     48
PII True       1
PII False      0

Run with --check to make the command exit with exit code 1 if 'No PII set' > 0 (the yaml file will still be generated/updated).

You can prevent leukeleu-django-gdpr from writing (back) to the yaml file by running with the --dry-run flag.
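The relationship between the reported counts and the --check exit code can be sketched as follows. This is an illustration of the behaviour described above, not the package's implementation; the function names pii_counts and check_exit_code are hypothetical, and the input is the parsed content of gdpr.yml.

```python
def pii_counts(data):
    """Count fields per pii state, using the labels from the command output."""
    counts = {"No PII set": 0, "PII True": 0, "PII False": 0}
    for model in (data.get("models") or {}).values():
        for field in (model.get("fields") or {}).values():
            value = field.get("pii")
            if value is None:
                counts["No PII set"] += 1
            elif value:
                counts["PII True"] += 1
            else:
                counts["PII False"] += 1
    return counts


def check_exit_code(counts):
    """Exit code 1 while any field is unclassified, 0 otherwise."""
    return 1 if counts["No PII set"] > 0 else 0


# Stand-in for the parsed gdpr.yml.
parsed = {
    "models": {
        "auth.User": {
            "fields": {
                "username": {"pii": True},
                "first_name": {"pii": None},
                "is_active": {"pii": False},
            }
        }
    }
}

counts = pii_counts(parsed)
print(counts, check_exit_code(counts))
```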

Excluding/including

To exclude apps, models or fields from this process altogether, list them in the exclude: list in the yaml file. Each item is a regex which should match an object's string representation in the following formats:

  • for apps: the app's label, e.g. auth.
  • for models: the model's label, e.g. auth.Permission.
  • for fields: the model's label followed by . followed by the field's name, e.g. auth.User.username.

Keep in mind that the items in the list are considered to be regexes which should fully match the object's string representation.
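What "fully match" means in practice can be shown with Python's re.fullmatch. This sketch only demonstrates the regex semantics, under the assumption that the package's matching behaves like re.fullmatch; the helper is_excluded and the example patterns are hypothetical, and the sketch says nothing about how excludes cascade from apps or models to their fields.

```python
import re


def is_excluded(label, patterns):
    """True if any pattern matches the whole label, not just a prefix."""
    return any(re.fullmatch(pattern, label) for pattern in patterns)


patterns = [r"auth\.Permission", r"auth\.User\.(first|last)_name"]

print(is_excluded("auth.Permission", patterns))           # True
print(is_excluded("auth.Permission.codename", patterns))  # False: only a prefix matches
print(is_excluded("auth.User.first_name", patterns))      # True
print(is_excluded("auth.User.username", patterns))        # False

# An unescaped "." matches any character, so escape it to avoid surprises.
print(bool(re.fullmatch(r"auth.User", "authXUser")))      # True
print(bool(re.fullmatch(r"auth\.User", "authXUser")))     # False
```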

Default excludes

By default, leukeleu-django-gdpr excludes fields of the following types:

  • AutoField
  • UUIDField
  • BooleanField
  • RelatedField

and the following apps:

  • django.contrib.admin
  • django.contrib.contenttypes

If you still want to include a field/model that would be excluded this way, you can put an item in the include: list in the yaml file:

include:
- clients\.Client\.external_epd_uuid
- accounts\.Profile\.is_pregnant
- admin\.LogEntry

Proxy models are always excluded. They are the same as the model they proxy, so there is no benefit in including them.

Anonymizing data

Leukeleu-django-gdpr comes with an anonymize management command that anonymizes all PII fields in the database.

It is meant to be used in development only. It requires an additional dependency and setting DEBUG = True.

pip install leukeleu-django-gdpr[anonymize]
./manage.py anonymize

This command uses the gdpr.yml file to anonymize all fields marked as PII in the database.

To change the configuration, you can create a subclass of BaseAnonymizer. Note that anonymizer functions should conform to the leukeleu_django_gdpr.anonymize.AnonymizerFunction protocol:

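# some_file.py
from django.db.models import Field, Model
from faker import Faker

# Import path assumed: the module that hosts the AnonymizerFunction protocol
from leukeleu_django_gdpr.anonymize import BaseAnonymizer

fake = Faker(["nl-NL"])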

def custom_anonymizer(obj: Model, field: Field):
    return getattr(obj, field.name)[:4]

class Anonymizer(BaseAnonymizer):
    # Exclude rows
    # Default: superusers and staff users are excluded
    extra_qs_overrides = {
        "app.Model": Model._base_manager.exclude(some_field=...),
        ...
    }

    # Specify fake data for a field
    # Default: user's first_name and last_name are filled with random first/last names
    extra_field_overrides = {
        "app.Model.some_field": lambda obj, field: fake.word(),
        "app.Model.some_other_field": lambda obj, field: "same value for every cell",
        ...
    }
    
    # Specify the fake data used for a field type
    # Use for custom fields or to overwrite defaults
    # Default: django builtin fields have "sensible" defaults
    extra_fieldtype_overrides = {
        "CustomPhoneNumberField": lambda obj, field: fake.phone_number(),
      
        # Also specify a unique variant (append with ".unique")
        "CustomPhoneNumberField.unique": lambda obj, field: fake.unique.phone_number(),

        # You can also use full custom functions for more complex behaviour
        "ImageHash": custom_anonymizer,
        ...
    }

    # Exclude fields
    # Default: no fields are excluded
    excluded_fields = [
        "app.SomeModel.some_field",
        ...
    ]

Then add this setting to your settings file:

DJANGO_GDPR_ANONYMIZER_CLASS = "location.to.custom.Anonymizer"
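A custom anonymizer function can be tried out in isolation before wiring it into an Anonymizer subclass. The sketch below assumes only the (obj, field) call shape used in the example above; mask_anonymizer is a hypothetical function, and SimpleNamespace objects stand in for a Django model instance and field so the snippet runs without a database.

```python
from types import SimpleNamespace


def mask_anonymizer(obj, field):
    """Replace letters and digits with 'x', keeping length and punctuation."""
    value = str(getattr(obj, field.name))
    return "".join("x" if char.isalnum() else char for char in value)


# Stand-ins for a model instance and one of its fields.
obj = SimpleNamespace(email="jane@example.com")
field = SimpleNamespace(name="email")

print(mask_anonymizer(obj, field))  # xxxx@xxxxxxx.xxx
```

Such a function could then be referenced from extra_field_overrides or extra_fieldtype_overrides in the same way as custom_anonymizer above.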

Checks

Leukeleu-django-gdpr adds a gdpr.I001 check to the check command. This check will fail if there are any pii: null values in the yaml file. To run the check, run:

./manage.py check

CI/CD

Run the check command to make a (scheduled) CI/CD task fail if there are unclassified fields, which can happen if someone adds a field to a model but forgets to classify it in the gdpr.yml.
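As an alternative for pipelines that call Python scripts, the gdpr command with --check (which exits with code 1 when unclassified fields remain) can be wrapped as below. This is a sketch under assumptions: the helper names are hypothetical, manage.py is assumed to be in the working directory, and combining --check with --dry-run is assumed to be supported.

```python
import subprocess
import sys


def gdpr_check_command(manage_py="manage.py", dry_run=False):
    """Build the command line for './manage.py gdpr --check'."""
    command = [sys.executable, manage_py, "gdpr", "--check"]
    if dry_run:
        # Assumption: --dry-run may be combined with --check.
        command.append("--dry-run")
    return command


def run_gdpr_check(manage_py="manage.py", dry_run=False):
    """Run the command and return its exit code (non-zero fails the CI job)."""
    completed = subprocess.run(gdpr_check_command(manage_py, dry_run))
    return completed.returncode


print(gdpr_check_command(dry_run=True)[1:])
```

A CI step could then end with sys.exit(run_gdpr_check()) so that the job fails whenever the command does.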
