Skip to main content

Load data from spreadsheets easily

Project description

Superspreader 🦠

Superspreader is a little helper library that simplifies working with spreadsheets. It is built on top of openpyxl. OpenPyXL is its only dependency.

Instead of looping over rows and columns manually, the structure of a spreadsheet is described in a class:

from superspreader import fields
from superspreader.sheets import BaseSheet


class AlbumSheet(BaseSheet):
    """
    This class describes a sheet in an Excel document
    """

    sheet_name = "Albums" # The sheet is named “albums”
    header_rows = 3 # The sheet has three header rows

    # The column labels are in the second row.
    # It is *not* zero based to match the Excel row number
    label_row = 2


    # The columns
    artist = fields.CharField(source="Artist", required=True)
    album = fields.CharField(source="Album")
    release_date = fields.DateField(source="Release Date")
    average_review = fields.FloatField(source="Average Review")
    chart_position = fields.IntegerField(source="Chart Position")

Ready? Let’s load an Excel spreadsheet!

if __name__ == "__main__":
    sheet = AlbumSheet("albums.xlsx")
    # Load and parse data from the document
    sheet.load()

    print(sheet.has_errors)
    # False
    print(sheet.errors)
    # []
    print(sheet.infos)
    # []

    for row_dict in sheet:
        print(row_dict)

# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5}
# {'artist': 'The Wombats', 'album': 'Fix Yourself, Not The World', 'release_date': datetime.date(2022, 3, 7), 'average_review': 3.9, 'chart_position': 7}
# {'artist': 'Kokoroko', 'album': 'Could We Be More', 'release_date': datetime.date(2022, 8, 1), 'average_review': 4.7, 'chart_position': 30}

In tests/spreadsheets is a sample spreadsheet that is used for testing. Feel free to fiddle around.

There’s a lot more to say and I’ll update the documentation as I go.

Adding non-spreadsheet fields

To provide additional fields, use extra_data. Fields from the spreadsheet take precedence over extra data.

extra_data = {
    "status": "released"
}
sheet = AlbumSheet("albums.xlsx", extra_data=extra_data)
sheet.load()
# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5, 'status': 'released'}

Use a callable for dynamic extra data:

extra_data = {
    "summary": lambda row: f"{row.get('album')} by {row.get('artist')}"
}

sheet = AlbumSheet("albums.xlsx", extra_data=extra_data)
# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5, 'summary': 'Toy by David Bowie'}

Changelog

0.2.3

  • Adds support for inheriting sheets (before that, fields from base classes weren’t recognized)

0.2.2

  • Adds support for callables in `extra_data``

The API is inspired by Django’s model API and ElasticSearch DSL.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

superspreader-0.2.3.tar.gz (10.8 kB view details)

Uploaded Source

Built Distribution

superspreader-0.2.3-py3-none-any.whl (9.2 kB view details)

Uploaded Python 3

File details

Details for the file superspreader-0.2.3.tar.gz.

File metadata

  • Download URL: superspreader-0.2.3.tar.gz
  • Upload date:
  • Size: 10.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for superspreader-0.2.3.tar.gz
Algorithm Hash digest
SHA256 759c8aa45d7c6d9e8357bc18d066e9b7ffb39c0de2a39a438037cb8e9eb3a713
MD5 be50cad7a16847792615eb0cbb46cdb9
BLAKE2b-256 0b8a2c04a1aca9b7ee02a8fba0928bdf4b47660c9f0a62f518f4ed6356e5a709

See more details on using hashes here.

File details

Details for the file superspreader-0.2.3-py3-none-any.whl.

File metadata

File hashes

Hashes for superspreader-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 ae3a1d405f517c58c098c63746dfff5ddc08aabb19793c080dc202c4a3670afe
MD5 bc8677526df6ae4dba92dcc00b97ac10
BLAKE2b-256 4edb016304c2a4929105a0dea11291df938dd6fa0ae4e8ec1f0bad777d9b677e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page