A blazingly fast spreadsheet parser for .xlsx files

SheetReader Python Bindings

SheetReader lets you read Excel spreadsheet files (.xlsx) blazingly fast. This repository contains the Python bindings; the core library is implemented in C++.

Quickstart

SheetReader can be installed with pip:

pip install pysheetreader

After successful installation, spreadsheets can be loaded:

import pysheetreader as sr
sheet = sr.read_xlsx("my_favorite_sheet.xlsx")

To convert a spreadsheet into a pandas DataFrame:

import pysheetreader as sr
import pandas as pd
sheet = sr.read_xlsx("my_favorite_sheet.xlsx")
df = pd.DataFrame.from_dict(sheet[0])
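
The conversion above suggests that `sheet[0]` holds a mapping from column names to column values, although the exact return type of `read_xlsx` is not documented here. Under that assumption, the sketch below uses a hand-written stand-in dict (hypothetical data, hypothetical helper `column_lengths`) to show what `pd.DataFrame.from_dict` produces. The pandas import is guarded so the snippet also runs without it.

```python
# Stand-in for sheet[0]; assumed shape: column name -> list of cell values.
columns = {
    "id": [1, 2, 3],
    "name": ["alpha", "beta", "gamma"],
}


def column_lengths(data):
    """Return the number of values per column."""
    return {name: len(values) for name, values in data.items()}


# from_dict (default orient="columns") needs equally long columns.
lengths = column_lengths(columns)
assert len(set(lengths.values())) == 1, "columns differ in length"

try:
    import pandas as pd
except ImportError:
    pd = None  # pandas is optional for this sketch

if pd is not None:
    df = pd.DataFrame.from_dict(columns)
    print(df.shape)          # (rows, columns)
    print(list(df.columns))  # keys of the dict become column names
```

If the real return value has a different layout, only the stand-in dict needs to change; the `from_dict` call stays the same.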

Parameters:

| Parameter | Type | Description | Default |
|---|---|---|---|
| path | string | The path of the .xlsx file to parse. | - |
| sheet | integer or string | The sheet of the file to parse, can be either the index (starting at 1) or the name. | 1 |
| headers | boolean | Whether to interpret the first parsed row as headers. | True |
| skip_rows | integer | How many rows to skip before parsing data. | 0 |
| skip_columns | integer | How many columns to skip before parsing data. | 0 |
| num_threads | integer | How many threads to use for parsing. Use -1 for automatic threading. | -1 |
| col_types | dict or list | How to interpret parsed data, either by names (dict) or by position (list). Types: numeric, text, logical, date, skip, guess. | None |
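
As a minimal sketch of how these parameters might be combined, the following assumes that `read_xlsx` accepts them as keyword arguments under the names listed above; that is an assumption, not confirmed API. The sheet name `Sales`, the column names, and the helper `check_col_types` are hypothetical. The call is guarded so the snippet also runs when the package or the file is absent.

```python
import importlib.util
import os

# Type names listed in the parameter description for col_types.
VALID_COL_TYPES = {"numeric", "text", "logical", "date", "skip", "guess"}


def check_col_types(col_types):
    """Hypothetical helper: reject type names the documentation does not list."""
    if col_types is None:
        return None
    # A dict maps column names to types; a list gives types by position.
    names = col_types.values() if isinstance(col_types, dict) else col_types
    unknown = sorted(set(names) - VALID_COL_TYPES)
    if unknown:
        raise ValueError(f"unknown column types: {unknown}")
    return col_types


# Assumed keyword names, taken from the parameter list.
options = {
    "sheet": "Sales",      # hypothetical sheet name; an index such as 1 is the documented default
    "headers": True,       # treat the first parsed row as headers
    "skip_rows": 2,        # skip two rows before parsing data
    "skip_columns": 0,
    "num_threads": -1,     # automatic threading
    "col_types": check_col_types(
        {"id": "numeric", "name": "text", "created": "date"}
    ),
}

path = "my_favorite_sheet.xlsx"
if importlib.util.find_spec("pysheetreader") is not None and os.path.exists(path):
    import pysheetreader as sr

    sheet = sr.read_xlsx(path, **options)
```

Validating `col_types` before the call keeps typos such as `"integer"` from reaching the parser, whose behavior on unknown type names is not described here.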

Build Instructions

First clone the repository together with its submodules, which contain the sheetreader-core dependency:

git clone --recurse-submodules https://github.com/polydbms/sheetreader-python.git

To build from source, this repository provides a pyproject.toml. The SheetReader wheel file can be generated through:

python -m build .

or installed with pip through:

pip install .
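
After `pip install .`, one way to check that the package ended up importable is to look the module up without importing it. This sketch uses only the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("pysheetreader"):
    print("pysheetreader is importable")
else:
    print("pysheetreader not found; check the build or pip output")
```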

More resources

SheetReader is part of the PolyDB Project. We also provide bindings/extensions for several other environments:

Paper

SheetReader was published in the journal Information Systems. Cite as:

@article{DBLP:journals/is/GavriilidisHZM23,
  author       = {Haralampos Gavriilidis and
                  Felix Henze and
                  Eleni Tzirita Zacharatou and
                  Volker Markl},
  title        = {SheetReader: Efficient Specialized Spreadsheet Parsing},
  journal      = {Inf. Syst.},
  volume       = {115},
  pages        = {102183},
  year         = {2023},
  url          = {https://doi.org/10.1016/j.is.2023.102183},
  doi          = {10.1016/J.IS.2023.102183},
  timestamp    = {Mon, 26 Jun 2023 20:54:32 +0200},
  biburl       = {https://dblp.org/rec/journals/is/GavriilidisHZM23.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
