Skip to main content

Import from and export to XLSX files with data validation.

Project description

django-spreadsheet

Validate your XLSX spreadsheet file before importing it. Export your data to a formatted XLSX spreadsheet with data validation set up.

Requirements

  • Python 3.8+
  • Django 2.0+

Installation

  • run pip install django_spreadsheets
  • add django_spreadsheets.apps.SpreadsheetAppConfig to your INSTALLED_APPS
  • add path("admin/spreadsheets/", include("django_spreadsheets.urls", namespace="spreadsheets")) to your root urls.py

Usage

Start to create a file spreadsheets.py in your app. Inside this file, describe your spreadsheets configurations such as the following example:

from django_spreadsheets.base_config import SpreadsheetConfig, SheetConfig
from django_spreadsheets.registry import register
from libs.bananas.models import Banana


class BananaSheetConfig(SheetConfig):
    """Describes each columns of this sheet."""

    ID = {
        "name": "Identifier",
        "required": True,
    }
    COLOR = {
        "name": "Color",
        "validation": {
            "type": "list",
            "formula1": ["yellow", "green", "brown"],
        },
        "required": True,
    }
    SIZE = {
      "name": "Size",
      "validation": {
          "type": "whole",
          "operator": "greaterThan",
          "formula1": 0,
          "error": "Please enter a positive number",
      }
    }

    def get_data_to_export(self, request):
        return Banana.objects.values_list("id", "color", "size")


@register
class BananaSpreadsheetConfig(SpreadsheetConfig):
    sheets_config = {
        "Banana": BananaSheetConfig()
        # the key will be used as the name of the sheet, the value is a class describing the sheet's configuration.
    }

    class Meta:
        verbose_name = "Bananas"

    def update_database(self, request, sheets):
        for sheet_name, rows in sheets.items():
            for row in rows:
                Banana.objects.get_or_create(
                  id=row["ID"],
                  color=row["COLOR"],
                  size=row["SIZE"],
                )

Then, you can go to //localhost:8000/admin/spreadsheets/export/ to export a XSLX file and //localhost:8000/admin/spreadsheets/import/ to import into your database.

A note on urls

Here are all the urls of the app that you can use:

from django.urls import reverse

# Import form (select config, click next to select file)
import_index = reverse("spreadsheets:import-index")
# Import form (select file)
import_all_objects = reverse("spreadsheets:import-all-objects") + "?config=BananaSpreadsheetConfig"
# Import template (get example file with inbuilt validation and column names)
import_template = reverse("spreadsheets:import-template")

# Export form (select config, click next to export file)
export_index = reverse("spreadsheets:export-index")
# Export file
export_all_objects = reverse("spreadsheets:export-all-objects") + "?config=BananaSpreadsheetConfig"

Faster exports

If you only need to export a huge amount of data, you can use django_spreadsheets.base_config.CSVConfig and export your data in a csv file. It will be faster than the xlsx export, but you'll be limited to one sheet only, and you won't be able to import data.

Example:

from django_spreadsheets.base_config import SheetConfig, CSVConfig
from django_spreadsheets.registry import register
from libs.bananas.models import Banana


class BananaSheetConfig(SheetConfig):
    """Column configuration for the communities sheet."""

    id = {
        "name": "Id",
        "required": True,
        "comment": "",
    }

    color = {
        "name": "Color",
        "required": True,
        "comment": "",
    }

    size = {
        "name": "Size",
        "required": True,
        "comment": "",
    }

    def get_data_to_export(self, request):
        return Banana.objects.all().values_list(
            "id",
            "color",
            "size",
        )


@register
class BananaExportOnlySpreadsheetConfig(CSVConfig):
    sheets_config = {
        "Bananas": InvoiceDeliveryPointSheetConfig(),
    }

    class Meta:
        verbose_name = "Bananas (fast)"

    def update_database(self, request, sheets):
        return None

SpreadsheetConfig class

Inherit this class to describe your spreadsheets' configuration. Define a sheets_config dict attribute containing the name of your sheet as a key, and a SheetConfig class as a value.

Validations

Several validations are run before importing a spreadsheet:

Config version validation

Validate that the configuration version stored in the file is the same as the SpreadsheetConfig.version used.

Sheets validation

  • Validate that all sheets in the selected SpreadsheetConfig class, exist in imported file.
  • Validate that all sheets in imported file, exist in the selected SpreadsheetConfig class.

Columns headers validation

Validate that the imported file contains the same columns headers as described in the SheetConfig used.

Required data validation

Validate that there is a value in the imported file for each required, required_unless, required_if columns.

Uniqueness validation [/!\ not implemented]

Validate that multiple cells in a row are unique together in the imported file.

Types validation

Validate that each column in the imported file contains data of correct type (date, positive integer, ...).

Data match validation

Validate that we find an object in the database matching the value in the imported file. If not, we ask the user to chose from a list of possibles values.

Manual import validation

After automatic validations (and after unknown values have been matched when needed), the wizard runs optional business rules defined on your SpreadsheetConfig class.

Override validate_custom_rules(self, request, sheets) and return a list of errors. The default implementation returns an empty list.

Example:

from django_spreadsheets.errors import ImporterError

def validate_custom_rules(self, request, sheets):
    errors = []
    for row_index, row in enumerate(sheets.get("Sheet name", []), start=2):
        if problem_in_this_row(request, row):
            errors.append(
                ImporterRowError(
                    sheet_name=row_index,
                    row=row,
                    title="There is a problem in this row",
                )
            )
    return errors

The errors returned can be anything, but must:

  • return a title

You are encouraged to create your own errors (you can inherit from ImporterCellMixin or ImporterSheetMixin). These mixins and some already-defined errors (like ImporterCellError and ImporterRowError) are livin in errors.py.

You may keep extra checks in update_database() during a migration, but the recommended approach is to consolidate access and business rules in validate_custom_rules() only.

Import data

If you want to import data in y our database from this spreadsheet, you need to define a update_database(self, request, sheets) method. This method is called after all data validations have been completed, and receive a python dict with an item for each sheet. The key is the name of the sheet, and the value is a python list of rows. Each row is a list containing a value for each column.

def update_database(self, request, sheets):
    for sheet_name, rows in sheets.items():
        for row in rows:
            for cell_value in row:
                print(cell_value)

SheetConfig class

Inherit this class to describe your sheets' configuration. Each attribute describes a column using a dictionary such as the following:

<column_identifier> = {
  "name": str (required),  # The name of the column used in the header row
  "required": bool,  # Whether a value in the column is required when importing
  "required_unless": str,  # Whether a value in this column is required if the pointed column is empty
  "required_if": str,  # Whether a value in this column is required if the pointed column is filled
  "comment": str,  # A comment that appears in the header cell
  "admin_only": bool,  # Whether this column should be exported only when the current user is an administrator
  "validation": {  # A data validation applied to each cell of this column
    "name": str,  # A name used as header in the hidden config sheet
    "type": str (required),  # <url|date|list|range|whole...> See openpyxl documentation to get all available validation types
    ...  # Same args as openpyxl's DataValidation objects (operator, formula1, formula2, showErrorMessage, ...)
    ...  # Note that you can pass callables to `formula1` and `formula2` keys
  },
  "header_style": str,  # The name of a style that already exists in the SpreadsheetConfig to which this sheet belongs
  "style": str,  # The name of a style that already exists in the SpreadsheetConfig to which this sheet belongs
  "number_format": str # A value among FORMAT_* constants from openpyxl.styles.numbers or others values supported by XLSX spec
},

Note on validation

To create a boolean validation, you may use a list validation type with django_spreadsheets.constants.YES_NO_CHOICES as formula1 value. To ensure a boolean value is displayed in your exported file, add "number_format": "BOOLEAN" in your column configuration.

The formula1 and formula2 keys accept callable as values. As the attributes of the SheetConfig class are assigned at initialization, you must use a callable to retrieve values from the database. Otherwise, those values will not be refreshed until the django process restart.

Note on number format

When the value in your exported file is not displayed correctly or not consistent across rows, you may define the number_format key in your column configuration.

Openxpyxl's format constants are not exhaustive. You may use other supported strings (eg: "BOOLEAN").

Export data

If you want to export data to this sheet, you need to define a get_data_to_export(self, request) method. This method must return a python list containing an item by row. Each item in the list is a list containing a value for each column.

def get_data_to_export(self, request):
  return [
    [1, "yellow", 20],
    [1, "green", 22],
  ]

Settings

SPREADSHEETS_CONFIG_SHEET_TITLE

Default: _("Configuration")

The name of the (hidden) config sheet in the generated file. This special sheet is used to store data needed for validation (eg. lists of choices).

SPREADSHEETS_MAX_VALIDATION_ROWS

Default: 700

When exporting data, the generated spreadsheet will have at least SPREADSHEETS_MAX_VALIDATION_ROWS rows with data validation set up.

SPREADSHEETS_MAX_STYLED_ROWS

Default: 700

When exporting data, the generated spreadsheet will have at least SPREADSHEETS_MAX_STYLED_ROWS rows styled.

SPREADSHEETS_MAX_COL_WIDTH_AND_ROW_HEIGHT

Default: 1500

When exporting data, the generated spreadsheet will calculate max width of column and height for rows for at least SPREADSHEETS_MAX_COL_WIDTH_AND_ROW_HEIGHT_CHECK rows.

SPREADSHEETS_COMMENT_AUTHOR

Default: "Django Spreadsheet"

The name of the author used when adding comments on cells.

SPREADSHEETS_VALIDATION_LIST_REPLACEMENT_DELIMITER

Default: "/"

The delimiter used to replace , found in list items as , is not permitted inside list items.

SPREADSHEETS_EXPORT_FILE_NAME

Default: "export {config_name} {date}.xlsx"

The name of the exported file when downloaded. Token config_name and date will be replaced respectively by the name of the configuration that is currently exported and the current date.

SPREADSHEETS_TEMPLATE_FILE_NAME

Default: "{config_name} template file.xlsx"

The name of the template file when downloaded. Token config_name will be replaced by the name of the configuration that is currently exported.

SPREADSHEETS_CSV_EXPORT_FILE_NAME

Default: export {config_name} {date}.csv

The name of the exported csv file when downloaded. Token config_name and date will be replaced respectively by the name of the configuration that is currently exported and the current date.

SPREADSHEETS_CSV_DELIMITER

Default: ,

The delimiter that will be used in the csv export.

Tests

Launch them using this command:

poetry install --with dev
poetry run pytest

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

django_spreadsheets-1.6.1.tar.gz (31.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

django_spreadsheets-1.6.1-py3-none-any.whl (38.8 kB view details)

Uploaded Python 3

File details

Details for the file django_spreadsheets-1.6.1.tar.gz.

File metadata

  • Download URL: django_spreadsheets-1.6.1.tar.gz
  • Upload date:
  • Size: 31.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.3 CPython/3.11.2 Linux/6.8.0-124-generic

File hashes

Hashes for django_spreadsheets-1.6.1.tar.gz
Algorithm Hash digest
SHA256 f49310db1311a82e50f85ce41660052a535c49a2739d44404bbec96e47f31ae1
MD5 7a8542c5782be153091731222e079633
BLAKE2b-256 c978df3ab075775263c8e5100a0ee9594fce22f1b932ded667ab0a35f0150778

See more details on using hashes here.

File details

Details for the file django_spreadsheets-1.6.1-py3-none-any.whl.

File metadata

  • Download URL: django_spreadsheets-1.6.1-py3-none-any.whl
  • Upload date:
  • Size: 38.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.3 CPython/3.11.2 Linux/6.8.0-124-generic

File hashes

Hashes for django_spreadsheets-1.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 35e8ae8c9c507ab563eeea8c159f69396cf2d6529522f61b12d52af6f2f9a66b
MD5 543b8cd592b44d22a02b4b089aba89f5
BLAKE2b-256 b2f500e90fa64e80088c5f443ac01532c22a4cefef22e5ef85c5659c313217c9

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page