Skip to main content

Read and write XLSB and XLSX files efficiently.

Project description

Python XLSB Reader & Writer

A Python library for reading and writing XLSB and XLSX files efficiently.

Installation

pip install pyxlsbwriter

Usage

Basic Example

from pyxlsbwriter import XlsbWriter
import datetime
from decimal import Decimal

data = [
    ["Name", "Age", "City", "info"],
    [-123, 2147483647, 2147483648, 2147483999],
    ["x", "y", "z", datetime.datetime.today()],
    ["Alice", 25, "New York", datetime.date.today()],
    ["Bob", 30, "London", Decimal(3.14)],
    ["Charlie", 35, "Paris", datetime.datetime.now()],
    [True, False, None, datetime.datetime.utcnow()]
]

# Initialize writer with a specific compression level
with XlsbWriter("output.xlsb", compressionLevel=6) as writer:
    # Add a visible sheet
    writer.add_sheet("Visible Sheet")
    writer.write_sheet(data)

    # Add a hidden sheet
    writer.add_sheet("Hidden Sheet", hidden=True)
    writer.write_sheet([["This sheet is hidden."]])

XlsxWriter Example

from pyxlsbwriter import XlsxWriter
import datetime
from decimal import Decimal

data = [
    ["Name", "Age", "City", "info"],
    [-123, 2147483647, 2147483648, 2147483999],
    ["x", "y", "z", datetime.datetime.today()],
    ["Alice", 25, "New York", datetime.date.today()],
    ["Bob", 30, "London", Decimal(3.14)],
    ["Charlie", 35, "Paris", datetime.datetime.now()],
    [True, False, None, datetime.datetime.utcnow()]
]

# Initialize writer with a specific compression level
with XlsxWriter("output.xlsx", compressionLevel=6) as writer:
    # Add a visible sheet
    writer.add_sheet("Visible Sheet")
    writer.write_sheet(data)

    # Add a hidden sheet
    writer.add_sheet("Hidden Sheet", hidden=True)
    writer.write_sheet([["This sheet is hidden."]])

Reading XLSB and XLSX Files

Reading files is done via the ExcelReader class, which automatically detects the format.

from pyxlsbwriter import ExcelReader

with ExcelReader("input.xlsx") as reader:  # or .xlsb
    names = reader.get_sheet_names()
    print(f"Sheets: {names}")

    for sheet_name in names:
        rows = reader.read_all(sheet_name)
        for row in rows:
            print(row)

# Generator usage (memory efficient for large files):
with ExcelReader("large_file.xlsb") as reader:
    for row in reader.get_rows("Sheet1"):
        print(row)

Streaming from a Database (ODBC)

This example shows how to stream data directly from a database query into an XLSB file. This is highly memory-efficient as it doesn't load the entire dataset into memory.

First, ensure you have pyodbc installed:

pip install pyodbc

Then, you can use a generator function to feed data to XlsbWriter.

import os
import pyodbc
from typing import Generator
from pyxlsbwriter import XlsbWriter

# --- Configuration ---
# Make sure you have an ODBC driver and a configured DSN, or use a DSN-less connection string.
DSN = "DRIVER={Your ODBC Driver};SERVER=your_server;DATABASE=your_db;UID=your_user;PWD=your_password"
QUERY = "SELECT * FROM YourTable"
OUTPUT_FILENAME = "db_output.xlsb"

def row_generator(cursor: pyodbc.Cursor) -> Generator[list[any], None, None]:
    """
    Generates rows from a pyodbc cursor, yielding headers first, followed by data rows.
    """
    # Extract column headers from cursor description
    headers = [column[0] for column in cursor.description]
    yield headers

    # Yield each row until the cursor is exhausted
    while row := cursor.fetchone():
        yield list(row)

# --- Main Execution ---
try:
    # Connect to the database
    with pyodbc.connect(DSN) as conn:
        cursor = conn.cursor()
        cursor.execute(QUERY)

        # Use XlsbWriter to write the data stream
        with XlsbWriter(OUTPUT_FILENAME) as writer:
            writer.add_sheet("Database Export")
            writer.write_sheet(row_generator(cursor))
            
            # You can also add hidden sheets with metadata, like the query itself
            writer.add_sheet("SQL Query", hidden=True)
            writer.write_sheet([["SQL"], [QUERY]])


    print(f"Successfully created '{OUTPUT_FILENAME}'")

except pyodbc.Error as ex:
    sqlstate = ex.args[0]
    print(f"Database connection or query execution error: {sqlstate}\n{ex}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Performance

pyxlsbwriter is designed for high performance, both when writing and reading large datasets.

Writing

The binary .xlsb format is significantly more compact and faster to write compared to the standard XML-based .xlsx format.

A benchmark was performed with a 50000x50 dataset:

Library Time Taken File Size
pyxlsbwriter (XlsbWriter) 2.48s 7.52 MB
pyxlsbwriter (XlsxWriter) 5.02s 6.58 MB
xlsxwriter 11.56s 11.35 MB

Note: The xlsxwriter library offers significantly more features than pyxlsbwriter, which is why its performance is lower. pyxlsbwriter is optimized for high-performance writing of large datasets with a focus on speed and file size efficiency.

Reading

Dataset XLSX XLSB
65K rows x 14 cols ~1.9s ~2.4s
200K rows x 4 cols ~2.0s ~3.2s

XLSX reading uses the native C-based expat XML parser, while XLSB reading is implemented in pure Python (binary BIFF record parser), which explains the difference in read performance.

You can run the writing benchmark yourself using the script located in examples/performance_test.py.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyxlsbwriter-0.1.0.tar.gz (26.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyxlsbwriter-0.1.0-py3-none-any.whl (22.4 kB view details)

Uploaded Python 3

File details

Details for the file pyxlsbwriter-0.1.0.tar.gz.

File metadata

  • Download URL: pyxlsbwriter-0.1.0.tar.gz
  • Upload date:
  • Size: 26.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for pyxlsbwriter-0.1.0.tar.gz
Algorithm Hash digest
SHA256 9de6750b1e5197aa18bf9d76a92b4731900d9fe1a73f6393cbd234ad183df828
MD5 6e42817f043edfbcfad5efc2ced371dc
BLAKE2b-256 7559a5da7af9df1cf4a1cb7517804b38b31a830fbe1b636097637414874899cf

See more details on using hashes here.

File details

Details for the file pyxlsbwriter-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pyxlsbwriter-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 22.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for pyxlsbwriter-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f62bd36962f6d95335c8456e9705f00c4dd0f02da0655a2e817583569aa62eab
MD5 50da03a214bcd36bf02e24f5c6e2e29b
BLAKE2b-256 76092110e9394f7af9667a3fcd58cc47fe18b7b7003339394ff63b93aa5c86e6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page