Load data from spreadsheets easily
Superspreader 🦠
Superspreader is a small helper library that simplifies working with spreadsheets. It is built on top of openpyxl, which is its only dependency.
Instead of looping over rows and columns manually, the structure of a spreadsheet is described in a class:
from superspreader import fields
from superspreader.sheets import BaseSheet
class AlbumSheet(BaseSheet):
    """
    This class describes a sheet in an Excel document
    """

    sheet_name = "Albums"  # The sheet is named "Albums"
    header_rows = 3  # The sheet has three header rows
    # The column labels are in the second row.
    # It is *not* zero-based, so it matches the Excel row number
    label_row = 2

    # The columns
    artist = fields.CharField(source="Artist", required=True)
    album = fields.CharField(source="Album")
    release_date = fields.DateField(source="Release Date")
    average_review = fields.FloatField(source="Average Review")
    chart_position = fields.IntegerField(source="Chart Position")
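To make the row numbering concrete, here is a minimal sketch in plain Python. It does not use Superspreader and is not how the library is implemented. It assumes that `header_rows` counts the rows to skip at the top of the sheet and that `label_row` is the 1-based row holding the column labels. The helper `split_sheet` and the sample rows are hypothetical.

```python
def split_sheet(rows, header_rows, label_row):
    """Return (labels, data_rows) for a sheet given as a list of row tuples.

    Assumption: label_row is 1-based (like Excel), and the first
    header_rows rows contain no data.
    """
    labels = rows[label_row - 1]  # convert the 1-based row number to a list index
    data_rows = rows[header_rows:]  # everything after the header block
    return labels, data_rows


# Hypothetical sheet contents: three header rows, then two data rows.
rows = [
    ("Album overview", None, None),
    ("Artist", "Album", "Chart Position"),
    (None, None, None),
    ("David Bowie", "Toy", 5),
    ("Kokoroko", "Could We Be More", 30),
]

labels, data_rows = split_sheet(rows, header_rows=3, label_row=2)
records = [dict(zip(labels, row)) for row in data_rows]
print(records[0])
# {'Artist': 'David Bowie', 'Album': 'Toy', 'Chart Position': 5}
```

With `label_row = 2`, the labels come from list index 1, which is the second row as Excel numbers it.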
Ready? Let’s load an Excel spreadsheet!
if __name__ == "__main__":
    sheet = AlbumSheet("albums.xlsx")
    # Load and parse data from the document
    sheet.load()

    print(sheet.has_errors)
    # False
    print(sheet.errors)
    # []
    print(sheet.infos)
    # []

    for row_dict in sheet:
        print(row_dict)
    # {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5}
    # {'artist': 'The Wombats', 'album': 'Fix Yourself, Not The World', 'release_date': datetime.date(2022, 3, 7), 'average_review': 3.9, 'chart_position': 7}
    # {'artist': 'Kokoroko', 'album': 'Could We Be More', 'release_date': datetime.date(2022, 8, 1), 'average_review': 4.7, 'chart_position': 30}
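Because each row comes back as a dictionary, ordinary Python tools are enough for further processing. The sketch below hard-codes the three rows from the sample output instead of loading a file, so it runs without Superspreader. The variable names are only illustrative.

```python
import datetime

# Rows copied from the sample output; in real use they would come from
# iterating over a loaded sheet.
rows = [
    {"artist": "David Bowie", "album": "Toy",
     "release_date": datetime.date(2022, 1, 7),
     "average_review": 4.3, "chart_position": 5},
    {"artist": "The Wombats", "album": "Fix Yourself, Not The World",
     "release_date": datetime.date(2022, 3, 7),
     "average_review": 3.9, "chart_position": 7},
    {"artist": "Kokoroko", "album": "Could We Be More",
     "release_date": datetime.date(2022, 8, 1),
     "average_review": 4.7, "chart_position": 30},
]

# Albums that reached the top ten
top_ten = [row["album"] for row in rows if row["chart_position"] <= 10]
print(top_ten)
# ['Toy', 'Fix Yourself, Not The World']

# The best-reviewed album
best = max(rows, key=lambda row: row["average_review"])
print(best["album"])
# Could We Be More
```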
A sample spreadsheet used for testing is in tests/spreadsheets. Feel free to fiddle around.
There’s a lot more to say and I’ll update the documentation as I go.
Adding non-spreadsheet fields
To provide additional fields, use extra_data. Fields from the spreadsheet take precedence over extra data.
extra_data = {
    "status": "released"
}
sheet = AlbumSheet("albums.xlsx", extra_data=extra_data)
sheet.load()
# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5, 'status': 'released'}
Use a callable for dynamic extra data:
extra_data = {
    "summary": lambda row: f"{row.get('album')} by {row.get('artist')}"
}
sheet = AlbumSheet("albums.xlsx", extra_data=extra_data)
sheet.load()
# {'artist': 'David Bowie', 'album': 'Toy', 'release_date': datetime.date(2022, 1, 7), 'average_review': 4.3, 'chart_position': 5, 'summary': 'Toy by David Bowie'}
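The precedence rule and the callable handling can be pictured with a small stand-alone sketch. This is not Superspreader's actual code. `merge_extra_data` is a hypothetical helper that reproduces the documented behaviour under two assumptions: callables are called with the parsed row, and spreadsheet fields win when a key appears in both places.

```python
def merge_extra_data(row, extra_data):
    """Hypothetical helper, not part of Superspreader.

    Combines a parsed row with extra data. Values from the row win
    whenever the same key appears in both.
    """
    merged = {}
    for key, value in extra_data.items():
        # Callables receive the parsed row and return the value to store
        merged[key] = value(row) if callable(value) else value
    # Spreadsheet fields are applied last, so they override extra data
    merged.update(row)
    return merged


row = {"artist": "David Bowie", "album": "Toy"}
extra_data = {
    "status": "released",
    "album": "overridden by the spreadsheet value",
    "summary": lambda r: f"{r.get('album')} by {r.get('artist')}",
}

result = merge_extra_data(row, extra_data)
print(result["album"])
# Toy
print(result["summary"])
# Toy by David Bowie
```

Applying the row values last is one simple way to get the stated precedence. The library may achieve the same result differently.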
Changelog
0.2.3
- Adds support for inheriting sheets (previously, fields from base classes weren’t recognized)
0.2.2
- Adds support for callables in `extra_data`
The API is inspired by Django’s model API and Elasticsearch DSL.