Skip to main content

Load data from spreadsheets easily

Project description

Superspreader 🦠

Superspreader is a little helper library that simplifies working with spreadsheets. It is built on top of openpyxl. OpenPyXL is its only dependency.

Instead of looping over rows and columns manually, the structure of a spreadsheet is described in a class:

from superspreader import fields
from superspreader.sheets import BaseSheet


class AlbumSheet(BaseSheet):
    """
    This class describes a sheet in an Excel document
    """

    sheet_name = "Albums" # The sheet is named “albums”
    header_rows = 3 # The sheet has three header rows

    # The column labels are in the second row.
    # It is *not* zero based to match the Excel row number
    label_row = 2


    # The columns
    artist = fields.CharField(source="Artist", required=True)
    album = fields.CharField(source="Album")
    release_date = fields.DateField(source="Release Date")
    average_review = fields.FloatField(source="Average Review")
    chart_position = fields.IntegerField(source="Chart Position")

Ready? Let’s load an Excel spreadsheet!

if __name__ == "__main__":
    sheet = AlbumSheet("albums.xlsx")
    # Load and parse data from the document
    sheet.load()

    print(sheet.has_errors)
    # False
    print(sheet.errors)
    # []
    print(sheet.infos)
    # []

    for row_dict in sheet:
        print(row_dict)

# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5}
# {'artist': 'The Wombats', 'album': 'Fix Yourself, Not The World', 'release_date': datetime.date(2022, 3, 7), 'average_review': 3.9, 'chart_position': 7}
# {'artist': 'Kokoroko', 'album': 'Could We Be More', 'release_date': datetime.date(2022, 8, 1), 'average_review': 4.7, 'chart_position': 30}

In tests/spreadsheets is a sample spreadsheet that is used for testing. Feel free to fiddle around.

There’s a lot more to say and I’ll update the documentation as I go.

Field params

Fields must have a sourceparameter, that holds the column name for the spreadsheet.unique=True` may be used to indicate that a field’s value must be unique.

Adding static & dynamic data to rows

To provide additional data, use extra_data. Data from the spreadsheet take precedence over extra data.

extra_data = {
    "status": "released"
}
sheet = AlbumSheet("albums.xlsx", extra_data=extra_data)
sheet.load()
# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5, 'status': 'released'}

Use a callable for dynamic extra data:

extra_data = {
    "summary": lambda row: f"“{row.get('album')}” by {row.get('artist')}"
}

sheet = AlbumSheet("albums.xlsx", extra_data=extra_data)
# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5, 'summary': '“Toy” by David Bowie'}

Changelog

0.2.7

  • Adds support for unique validation

0.2.3

  • Adds support for inheriting sheets (before that, fields from base classes weren’t recognized)

0.2.2

  • Adds support for callables in extra_data

0.2.1

  • Adds support for providing field defaults by setting the default attribute or providing an instance-label value: fields.CharField(source="Album", default="not specified")

The API is inspired by Django’s model API and ElasticSearch DSL.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

superspreader-0.2.7.2.tar.gz (11.9 kB view details)

Uploaded Source

Built Distribution

superspreader-0.2.7.2-py3-none-any.whl (9.8 kB view details)

Uploaded Python 3

File details

Details for the file superspreader-0.2.7.2.tar.gz.

File metadata

  • Download URL: superspreader-0.2.7.2.tar.gz
  • Upload date:
  • Size: 11.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for superspreader-0.2.7.2.tar.gz
Algorithm Hash digest
SHA256 9b5d0dc64e347823be345cef358a071b1dc8407aaa8b8f8a7e7e20c108647803
MD5 b3d757e3dc8f401e888a8005e01728fe
BLAKE2b-256 6992384a09d237f40f024a9de0df3ee7eb024f2f8ee4ed7a8d8dc4d13d004f2c

See more details on using hashes here.

File details

Details for the file superspreader-0.2.7.2-py3-none-any.whl.

File metadata

File hashes

Hashes for superspreader-0.2.7.2-py3-none-any.whl
Algorithm Hash digest
SHA256 072cc2e1b078e2713a06445a18afd90561aff8c7850148951e49332e017467b5
MD5 7e75b9a923b1fdca3b3e737f2e6b5e50
BLAKE2b-256 558553ba0fee8c78fa1e824bb972764dbad7e9dd8b97122fe64c91b18fe6f7e6

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page