
Read very large Excel XLSX files efficiently

Project description

xlsx-reader

Python3 library optimised for reading very large Excel XLSX files, including those with hundreds of columns as well as rows.

Simple example

from xlsxr import Workbook

workbook = Workbook(filename="myworkbook.xlsx", convert_values=True)

for sheet in workbook.sheets:
    print("Sheet ", sheet.name)
    for row in sheet.rows:
        print(row)

Conversions

By default, every value is returned as a string, and all dates and datetimes appear in ISO 8601 format (YYYY-mm-dd or YYYY-mm-ddTHH:MM:SS). If you pass convert_values=True to the Workbook constructor, the library will convert numbers to ints or floats, and dates to datetime.datetime or datetime.date objects. There is no attempt to handle standalone times.

Empty cells appear as the empty string ''.
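If convert_values is left at its default, the ISO 8601 strings described above can be converted on the caller's side. The following is a minimal sketch using only the standard library. The helper name parse_iso_cell is hypothetical and not part of xlsxr, and it assumes cells hold exactly the two date formats listed above.

```python
from datetime import date, datetime


def parse_iso_cell(value):
    """Return a date or datetime for an ISO 8601 string, else the value unchanged.

    Hypothetical helper, not part of xlsxr.
    """
    # Empty cells (the empty string) and non-string values pass through untouched.
    if not isinstance(value, str) or value == "":
        return value
    try:
        if "T" in value:
            return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        # Not a date in either documented format: keep the original string.
        return value
```

Note that a text cell which merely looks like a date will also be converted, so this is best applied only to columns known to hold dates.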

xlsxr.workbook.Workbook class

Constructor

The xlsxr.Workbook class constructor takes the following keyword arguments:

Argument Description
filename Path to an Excel file on the local filesystem.
stream A file-like object (byte stream)
url The URL of a remote Excel file
convert_values If True, convert numbers and dates from strings to Python values (default is False)
fill_merged If True, repeat values to fill merged areas (default is False)

You may specify only one of filename, stream, or url.
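Because only one source argument may be given, calling code that accepts paths, URLs, and open files alike needs to choose the right keyword. Here is one possible sketch. The function workbook_source_kwargs is hypothetical (not part of xlsxr), and the rule "anything starting with http:// or https:// is a URL" is an assumption made for illustration.

```python
import io


def workbook_source_kwargs(source):
    """Map one source to the single matching Workbook keyword argument.

    Hypothetical helper, not part of xlsxr.
    """
    # File-like objects expose read(); pass them as a byte stream.
    if hasattr(source, "read"):
        return {"stream": source}
    text = str(source)
    # Assumption: only http(s) URLs are treated as remote files.
    if text.startswith(("http://", "https://")):
        return {"url": text}
    return {"filename": text}


# Intended use, assuming xlsxr is installed:
#   workbook = Workbook(**workbook_source_kwargs("myworkbook.xlsx"))
local_kwargs = workbook_source_kwargs("myworkbook.xlsx")
stream_kwargs = workbook_source_kwargs(io.BytesIO(b""))
```

Each returned dict always has exactly one key, so the result never violates the one-source rule.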

Properties

Property Description
sheets A list of xlsxr.sheet.Sheet objects
styles A list of xlsxr.style.Style objects

xlsxr.sheet.Sheet class

Properties

Property Description
workbook The parent Workbook object
name The name of the sheet
sheet_id The internal identifier of the sheet
state The state of the sheet (normally 'visible' or 'hidden')
relation_id ??
cols A list of metadata for each column.
rows A list of the data rows in the sheet (parsed on demand).
merges A list of merges in the sheet (parsed on demand).

Each row is a list of scalar values. They will all be strings or None unless you specified the convert_values option for the Workbook.
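Since each row is a plain list, a common next step is to key the values by a header row. The sketch below assumes the first row of the sheet holds column names, which the library does not guarantee. rows_as_dicts is a hypothetical helper, not part of xlsxr. It works on any iterable of lists, so sheet.rows can be passed directly.

```python
def rows_as_dicts(rows):
    """Yield one dict per data row, keyed by the values in the first row.

    Hypothetical helper, not part of xlsxr; assumes row 1 is a header row.
    """
    iterator = iter(rows)
    try:
        header = next(iterator)
    except StopIteration:
        return  # no rows at all
    for row in iterator:
        # Pad short rows with the empty string so every header gets a value.
        padded = list(row) + [""] * (len(header) - len(row))
        yield dict(zip(header, padded))
```

Being a generator, it consumes rows one at a time rather than building a second copy of the sheet in memory.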

Merges appear as strings defining ranges, e.g. "A1:C3".
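To work with a merge programmatically, the range string has to be split into row and column numbers. A minimal sketch, assuming ranges use plain A1-style references without sheet names or $ markers; parse_merge and column_index are hypothetical helpers, not part of xlsxr.

```python
import re

# One cell reference: column letters followed by a row number, e.g. "AA10".
_CELL = re.compile(r"^([A-Z]+)([0-9]+)$")


def column_index(letters):
    """Convert Excel column letters to a 1-based index (A is 1, AA is 27)."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_merge(ref):
    """Split a range such as "A1:C3" into (first_row, first_col, last_row, last_col)."""
    start, _, end = ref.partition(":")
    end = end or start  # tolerate a single-cell reference
    cells = []
    for cell in (start, end):
        match = _CELL.match(cell)
        if match is None:
            raise ValueError(f"not a cell reference: {cell!r}")
        cells.append((int(match.group(2)), column_index(match.group(1))))
    (first_row, first_col), (last_row, last_col) = cells
    return first_row, first_col, last_row, last_col
```

All four numbers are 1-based, matching the way Excel labels rows and columns.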

Columns

Columns are represented as dict objects with the following properties:

Key Description
collapsed True if the column is collapsed.
hidden True if the column is hidden
min ??
max ??
style A key into the styles property of the workbook.
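Because column metadata is a list of plain dicts, it can be filtered with ordinary Python. The sketch below keeps only columns not flagged as hidden. visible_columns is a hypothetical helper, not part of xlsxr. It reads keys with dict.get on the assumption that a key may be absent, and it does not touch min or max, whose meaning is not documented here.

```python
def visible_columns(cols):
    """Return the column dicts that are not marked hidden.

    Hypothetical helper, not part of xlsxr.
    """
    # A missing "hidden" key is treated as "not hidden" (an assumption).
    return [col for col in cols if not col.get("hidden")]
```

With xlsxr installed, the intended call would be visible_columns(sheet.cols).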

xlsxr.style.Style class

Properties

Property Description
number_formats ??
cell_style_formats ??
cell_formats ??
cell_styles ??

License

This is free and unencumbered software released into the public domain. See UNLICENSE.md for details.


This version

0.1
