Skip to main content

Load/dump Django models from/to Excel 2007+ workbooks

Project description

django-xlsx-serializer

PyPI: Version PyPI: Python PyPI: Django PyPI: License

Pre-commit Python: CI Codecov

Nox Ruff Mypy Prettier Conventional Commits

Overview

django-xlsx-serializer is a Django application designed to handle the data serialization and deserialization between Django models and Microsoft Excel 2007+ workbooks. Utilizing the OpenPyXL engine, this tool provides robust methods to export data from Django databases into XLSX files and import data from the files back into the databases. This functionality is essential for applications that require data exchange between Django-based systems and Excel, facilitating such tasks as data migration, reporting, and backups.

Features

The app allows you to:

  • Export Django models from a database to an Excel workbook via the dumpdata command.
  • Populate databases from Excel fixtures using the loaddata command.
  • Interact with Excel workbooks (either files or openpyxl.Workbook objects) and the database using the Django's core serialization utilities.

Requirements

Python Django Database engines
3.9 3.2, 4.0, 4.1, 4.2 SQLite3, PostgreSQL
3.10 3.2, 4.0, 4.1, 4.2, 5.0, 5.1, 5.2 SQLite3, PostgreSQL
3.11 4.1, 4.2, 5.0, 5.1, 5.2 SQLite3, PostgreSQL
3.12 4.2, 5.0, 5.1, 5.2 SQLite3, PostgreSQL
3.13 5.1, 5.2 SQLite3, PostgreSQL

All setups require OpenPyXL < 4.

Installation

The fastest way to add the package to your Python environment is to download and install it directly from PyPI. Use pip:

pip install django-xlsx-serializer

or any other dependency manager of your preference.

As soon as the installation is completed, all the app's functionalities can be accessed from the xlsx_serializer module:

import xlsx_serializer

The app is compatible with Excel 2007+ XLSX workbooks only. Adding support for the older XLS format is not planned.

Django Configuration

The app utilities can be incorporated into your Django project by following one of the approaches listed below:

  1. Installing the package as an app.
  2. Adding the package to serialization modules.
  3. Registering the app's serializers module from another app.

All of them associate the app's serializer with the xlsx format.

Install as an App

In your project settings module add xlsx_serializers to INSTALLED_APPS:

INSTALLED_APPS = [
    # ...
    "xlsx_serializer",
    # ...
]

Add to Serialization Modules

In your project settings module update the SERIALIZATION_MODULES dictionary:

SERIALIZATION_MODULES = {
    # ...
    "xlsx": "xlsx_serializer",
    # ...
}

Register from Another App

In any of the apps installed in your projects (let us call it myapp), register the xlsx_serializer manually in the app's ready hook:

# myapp/apps.py

from django.apps import AppConfig
from django.core import serializers


class MyAppConfig(AppConfig):
    name = "myapp"

    def ready(self) -> None:
        super().ready()

        # ...

        # Register serializers.
        serializers.register_serializer("xlsx", "xlsx_serializer")

There are many Django projects using a "core" app for defining project-wide utilities (e.g., custom commands, template tags, etc.). The configuration class of such an app is a good place to apply the code snippet above.

Usage

Excel Workbooks vs. Django Models

The app adopts quite intuitive correspondence between Excel workbooks (i.e., the collections of worksheets) and Django models:

  • A Django model is represented by a single worksheet.
  • In an Excel workbook, the models are identified by worksheet names.
  • Within an Excel worksheet, model instances are represented by rows, while the columns correspond to the model's fields.

Serialization

Serialization can be run either by the built-in dumpdata Django management command:

python manage.py dumpdata --format xlsx --output dump.xlsx

or from Django interactive shell:

>>> from django.core import serializers
>>> from polls.models import Question
>>> serializers.serialize("xlsx", Question.objects.all(), output="dump.xlsx")
# Prints: <openpyxl.workbook.workbook.Workbook object at ...>

Both the command and expression shown above save dump.xlsx workbook file. The latter additionally returns an openpyxl.Workbook object, which can be used later if necessary (e.g., in development or maintenance scripts).

When serializing, the app creates worksheets named using fully qualified model labels. For example, the Question model defined in the polls app is serialized to the "polls.Question" worksheet. Excel does not accept worksheet names longer than 31 characters. If the model's label is longer, it's truncated. A useful feature allowing you to circumvent this issue is that the output worksheet names can be customized using the model_sheet_names option. So, the command:

>>> workbook = serialize(
        "xlsx",
        Question.objects.all(),
        model_sheet_names={"polls.Question": "Questions"},
    )
>>> workbook
# Prints: <openpyxl.workbook.workbook.Workbook object at ...>

results in the polls.Question model data serialized in the "Questions" worksheet. Note that this option is not available when using the app via the dumpdata command.

The app inspects each key and value of the model_sheet_names dictionary. For the keys, it validates whether they represent valid model identifiers. The values, in turn, are checked to see if they are unique, are not too long, and do not contain invalid characters (?, *, :, \, /, [, ]).

Other key points:

  • DateField, DateTimeField, and TimeField values are serialized as ISO 8601 strings.
  • JSONField values are serialized as JSON strings returned by the respective field's encoders.
  • ManyToManyField values are serialized as stringified lists of foreign keys.
  • The app supports serialization by using natural keys. If it is triggered (by applying the --natural-primary/--natural-foreign flags), the natural keys are serialized as stringified tuples (or their lists in the case of many-to-many relations).

Deserialization

The recommended way of employing the app to load the model data from an Excel fixture to the database is to call it via the loaddata command:

python manage.py loaddata fixture.xlsx

Deserialization requires the input workbook's worksheets to have names that are either the fully qualified labels or model names (case-insensitive). The latter can be applied if the model name is unique. For example, if the project uses models polls.Question and exams.Question, the worksheet named "Question" will not be deserialized.

Within a worksheet, ensure that the column headers correspond to the field names of the respective model. The app ignores a column if it does not represent a field. Empty rows and columns surrounding the data range are ignored as well. However, the app does not check the data for the missing or invalid values.

Other key points:

  • Populating DateField, DateTimeField, and TimeField with timezone support enabled in Django settings requires date/time values to be saved as ISO 8601 strings (date/time type values in Excel don't store timezone information).
  • Deserializing JSONfield requires values in a format compatible with the JSON decoder of the respective field.
  • In the case of ManyToManyField provide string representations of Python lists containing the primary (or natural, see the next bullet) keys of the related objects.
  • The app handles deserialization from natural keys by using ast.literal_eval. Make sure to provide the keys that are valid string representations of the corresponding values (i.e., tuples of primitive Python literals; in most cases, they are strings — if so, use single quotes as text delimiters).

Contributing

This is an open-source project that embraces contributions of all types. We require all contributors to adhere to our Code of Conduct. For comprehensive instructions on how to contribute to the project, please refer to our Contributing Guide.

Authors

Created and maintained by Kamil Paduszyński (@paduszyk).

License

Released under the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

django_xlsx_serializer-1.1.0.tar.gz (12.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

django_xlsx_serializer-1.1.0-py3-none-any.whl (11.9 kB view details)

Uploaded Python 3

File details

Details for the file django_xlsx_serializer-1.1.0.tar.gz.

File metadata

  • Download URL: django_xlsx_serializer-1.1.0.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for django_xlsx_serializer-1.1.0.tar.gz
Algorithm Hash digest
SHA256 ee61e32a21b58a4436f81d2379105dd3c9c077b5636734f54f9318a0ebb012b6
MD5 40f280ea2a02ebd0404270b43a304768
BLAKE2b-256 ae3602bfb79f26b3e7992214a6b64ecc715006a865d6b8e8e19c467e60abfd72

See more details on using hashes here.

Provenance

The following attestation bundles were made for django_xlsx_serializer-1.1.0.tar.gz:

Publisher: python-cd.yml on paduszyk/django-xlsx-serializer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file django_xlsx_serializer-1.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for django_xlsx_serializer-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fac7d05e623882c183239255c498f7336b55b17e95b4aa65ff20dcae29c7e15f
MD5 d153db619a115b925c5af27f3b4c7a7b
BLAKE2b-256 8054c70fe5b42d389be5ad33dc04fc906e4d56058ce3e1580051e25eda781591

See more details on using hashes here.

Provenance

The following attestation bundles were made for django_xlsx_serializer-1.1.0-py3-none-any.whl:

Publisher: python-cd.yml on paduszyk/django-xlsx-serializer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page