Fix Mac Excel corruption in openpyxl-generated .xlsx files

xlsx-fixer

Fix the Mac Excel "We found a problem with some content" error in openpyxl-generated .xlsx files.

openpyxl hardcodes inline strings (t="inlineStr") for every text cell. Mac Excel's strict OOXML parser rejects this, showing a recovery dialog on every open. Windows Excel silently accepts it, which is why the bug goes unnoticed during development.

xlsx-fixer rewrites the ZIP to use a proper shared string table (xl/sharedStrings.xml), removes inconsistent calc state, and strips illegal control characters. One function call. Zero dependencies beyond the standard library.

The Problem

If you generate .xlsx files with Python's openpyxl library and open them in Excel for Mac, you see:

"We found a problem with some content in 'file.xlsx'. Do you want us to try to recover as much as we can?"

After clicking "Yes":

"Removed Records: String properties from /xl/sharedStrings.xml part"

This happens because openpyxl writes every text cell as an inline string (<c t="inlineStr"><is><t>text</t></is></c>) instead of referencing a shared string table. The openpyxl maintainer has so far declined to change this behavior.
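
To see whether a given file is affected, you can count the inline-string cells yourself. The following is a minimal sketch using only the standard library; it is not xlsx-fixer's own check() implementation. The names count_inline_strings and make_demo_xlsx are hypothetical, and the demo ZIP contains a single worksheet part rather than a complete workbook.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def count_inline_strings(xlsx_bytes: bytes) -> dict[str, int]:
    """Return {worksheet part name: number of t="inlineStr" cells}."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        for name in zf.namelist():
            # Only worksheet parts; skips xl/worksheets/_rels/*.rels.
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            cells = root.iter(f"{{{NS}}}c")
            counts[name] = sum(1 for c in cells if c.get("t") == "inlineStr")
    return counts


def make_demo_xlsx() -> bytes:
    """Build a tiny ZIP with one worksheet part (demo data, not a full workbook)."""
    sheet = (
        f'<worksheet xmlns="{NS}"><sheetData><row r="1">'
        '<c r="A1" t="inlineStr"><is><t>Hello</t></is></c>'
        '<c r="B1"><v>42</v></c>'
        '</row></sheetData></worksheet>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", sheet)
    return buf.getvalue()


print(count_inline_strings(make_demo_xlsx()))
# prints {'xl/worksheets/sheet1.xml': 1}
```

For a real file, pass the result of open("report.xlsx", "rb").read() instead of the demo bytes.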

Install

pip install xlsx-fixer

Zero dependencies. Uses only the Python standard library (zipfile, xml.etree.ElementTree).

Usage

Python API

from xlsx_fixer import fix, check

# Fix in-place (after openpyxl wb.save())
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws["A1"] = "Hello, Mac Excel!"
wb.save("report.xlsx")

fix("report.xlsx")  # That's it. Mac-safe now.

# Fix to a new file
fix("report.xlsx", output="report_fixed.xlsx")

# Check without modifying
issues = check("report.xlsx")
for issue in issues:
    print(f"[{issue.severity}] {issue.code}: {issue.message}")

CLI

# Fix in-place
xlsx-fixer fix report.xlsx

# Fix to new file
xlsx-fixer fix report.xlsx -o fixed.xlsx

# Check without modifying
xlsx-fixer check report.xlsx

# Version
xlsx-fixer --version

Integration with openpyxl

Add an import and two lines to your existing code:

from openpyxl import Workbook
from xlsx_fixer import fix

wb = Workbook()
# ... your existing workbook code ...

wb.calculation.fullCalcOnLoad = None  # Remove inconsistent calc state
wb.save("output.xlsx")
fix("output.xlsx")  # Convert inline strings to shared string table

What It Fixes

# | Issue | Root Cause | Impact
1 | Inline strings | openpyxl hardcodes t="inlineStr" in cell/_writer.py | "We found a problem" dialog on every Mac open
2 | fullCalcOnLoad | openpyxl sets fullCalcOnLoad="1" without generating calcChain.xml | Recovery dialog; formulas show 0
3 | Stale calcChain.xml | References cells that no longer contain formulas | "Removed Records: Formula" in repair log
4 | Control characters | Illegal XML 1.0 chars (U+0000-U+0008, etc.) in cell values | ST_Xstring validation failures
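
Issue 4 can be illustrated with a short sketch. It assumes the characters are simply deleted (the source only says they are stripped) and uses the character ranges that XML 1.0 disallows; the helper name strip_illegal_xml_chars is hypothetical and not part of the xlsx-fixer API.

```python
import re

# Characters outside the XML 1.0 "Char" production that can appear in a
# Python str: C0 controls except tab, LF and CR, plus U+FFFE and U+FFFF.
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_illegal_xml_chars(text: str) -> str:
    """Remove characters that XML 1.0 does not allow in document content."""
    return _ILLEGAL_XML_CHARS.sub("", text)


cleaned = strip_illegal_xml_chars("a\x00b\tc\x0bd\n")
print(repr(cleaned))
# prints 'ab\tcd\n'
```

Tab, line feed, and carriage return are legal in XML 1.0, so they are left in place.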

How It Works

  1. Reads the entire .xlsx ZIP into memory
  2. Parses every xl/worksheets/sheet*.xml
  3. Finds all <c t="inlineStr"><is><t>TEXT</t></is></c> cells
  4. Builds a deduplicated shared string table
  5. Rewrites each cell as <c t="s"><v>INDEX</v></c>
  6. Creates xl/sharedStrings.xml with <sst> root element
  7. Adds relationship to xl/_rels/workbook.xml.rels
  8. Adds Override to [Content_Types].xml
  9. Removes fullCalcOnLoad from <calcPr> in xl/workbook.xml
  10. Optionally removes stale calcChain.xml
  11. Writes new ZIP back to the same (or specified output) path
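
Steps 3 to 6 can be sketched as follows for a single worksheet part. This is an illustration under stated assumptions, not the package's actual code: the function names (q, convert_sheet, build_shared_strings) are hypothetical, rich-text runs are flattened to plain text, and only the optional uniqueCount attribute is written on <sst>.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("", NS)  # serialize without an ns0: prefix


def q(tag: str) -> str:
    """Qualify a tag name with the SpreadsheetML namespace."""
    return f"{{{NS}}}{tag}"


def convert_sheet(sheet_xml: str, table: dict[str, int]) -> str:
    """Rewrite inline-string cells as shared-string references.

    `table` maps text -> index and is updated in place, so one table can be
    reused across every sheet in a workbook.
    """
    root = ET.fromstring(sheet_xml)
    # list() so the tree can be modified safely while looping.
    for cell in list(root.iter(q("c"))):
        if cell.get("t") != "inlineStr":
            continue
        inline = cell.find(q("is"))
        if inline is None:
            continue
        # Join every <t> under <is>; this drops rich-text formatting.
        text = "".join(t.text or "" for t in inline.iter(q("t")))
        index = table.setdefault(text, len(table))  # deduplicate
        cell.remove(inline)
        cell.set("t", "s")
        ET.SubElement(cell, q("v")).text = str(index)
    return ET.tostring(root, encoding="unicode")


def build_shared_strings(table: dict[str, int]) -> str:
    """Serialize the table as the body of xl/sharedStrings.xml."""
    sst = ET.Element(q("sst"), uniqueCount=str(len(table)))
    for text in table:  # insertion order matches the assigned indices
        t = ET.SubElement(ET.SubElement(sst, q("si")), q("t"))
        t.text = text
        if text != text.strip():
            t.set(XML_SPACE, "preserve")  # keep leading/trailing spaces
    return ET.tostring(sst, encoding="unicode")


sheet = (
    f'<worksheet xmlns="{NS}"><sheetData><row r="1">'
    '<c r="A1" t="inlineStr"><is><t>Hello</t></is></c>'
    '<c r="B1" t="inlineStr"><is><t>Hello</t></is></c>'
    '<c r="C1" t="inlineStr"><is><t>World</t></is></c>'
    '</row></sheetData></worksheet>'
)
table: dict[str, int] = {}
fixed = convert_sheet(sheet, table)
sst_xml = build_shared_strings(table)
print(table)
# prints {'Hello': 0, 'World': 1}
```

Running convert_sheet on its own output finds no remaining t="inlineStr" cells, so the table and the cells are left as they are.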

The fix is idempotent — running it twice is safe and the second run is a no-op.
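
One way to keep a step such as the [Content_Types].xml update idempotent is to check for an existing entry before adding one. The sketch below shows that idea for step 8 only. It is an assumption about how such a check could look, not the library's implementation, and ensure_sst_override is a hypothetical name.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
SST_PART = "/xl/sharedStrings.xml"
SST_TYPE = (
    "application/vnd.openxmlformats-officedocument."
    "spreadsheetml.sharedStrings+xml"
)
ET.register_namespace("", CT_NS)  # serialize without an ns0: prefix


def ensure_sst_override(content_types_xml: str) -> str:
    """Add the sharedStrings Override unless it is already present."""
    root = ET.fromstring(content_types_xml)
    for override in root.iter(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == SST_PART:
            return content_types_xml  # already registered: no-op
    ET.SubElement(
        root, f"{{{CT_NS}}}Override", PartName=SST_PART, ContentType=SST_TYPE
    )
    return ET.tostring(root, encoding="unicode")


demo = (
    f'<Types xmlns="{CT_NS}">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/xl/workbook.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>'
    '</Types>'
)
once = ensure_sst_override(demo)
twice = ensure_sst_override(once)
print(once.count(SST_PART), twice == once)
# prints 1 True
```

The same look-before-adding pattern would apply to the relationship entry in xl/_rels/workbook.xml.rels from step 7.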

Performance

The fix operates entirely in memory. On a real production workbook (11 sheets, 195 properties, 448 verification checks, ~100KB):

  • ~30ms on Apple Silicon
  • ~50ms on Intel Mac
  • Scales linearly with file size

Why Not Just Fix openpyxl?

The maintainer has declined to add shared string table support to the writer. The inline string behavior is hardcoded in cell/_writer.py (lines 21-22 and 70-79) with no API flag to change it. The project treats it as a design decision, not as a bug it plans to fix.

With 249M+ openpyxl downloads per month and 7,400+ dependent packages, this affects an enormous number of Python developers who generate Excel files for Mac users.

Tested On

  • Python 3.9 - 3.14
  • openpyxl 3.0.x - 3.1.x
  • Mac Excel 16.x (Microsoft 365)
  • Windows Excel (passes through unchanged)
  • LibreOffice Calc (passes through unchanged)

License

MIT License. See LICENSE for details.
