Fix Mac Excel corruption in openpyxl-generated .xlsx files

xlsx-fixer

Fix the Mac Excel "We found a problem with some content" error in openpyxl-generated .xlsx files.

openpyxl hardcodes inline strings (t="inlineStr") for every text cell. Mac Excel's strict OOXML parser rejects this, showing a recovery dialog on every open. Windows Excel silently accepts it, which is why the bug goes unnoticed during development.

xlsx-fixer rewrites the ZIP to use a proper shared string table (xl/sharedStrings.xml), removes inconsistent calc state, and strips illegal control characters. One function call. Zero dependencies beyond the standard library.

The Problem

If you generate .xlsx files with Python's openpyxl library and open them on Mac, you see:

"We found a problem with some content in 'file.xlsx'. Do you want us to try to recover as much as we can?"

After clicking "Yes":

"Removed Records: String properties from /xl/sharedStrings.xml part"

This happens because openpyxl writes every text cell as an inline string (<c t="inlineStr"><is><t>text</t></is></c>) instead of referencing a shared string table. The openpyxl maintainer has declined to change this behavior.
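
To see whether a given workbook is affected, you can count its inline-string cells directly. This is a minimal sketch using only the standard library, not part of the xlsx-fixer API: the function name count_inline_strings is hypothetical, and it assumes worksheet parts live under xl/worksheets/ as described in this README.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def count_inline_strings(path):
    """Return {worksheet part name: number of cells with t="inlineStr"}.

    Hypothetical helper for illustration; `path` may be a filename or a
    binary file-like object, as accepted by zipfile.ZipFile.
    """
    counts = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                root = ET.fromstring(zf.read(name))
                cells = root.iter(f"{{{NS}}}c")
                counts[name] = sum(1 for c in cells if c.get("t") == "inlineStr")
    return counts
```

Any non-zero count means the file still contains cells in the form shown above.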

Install

pip install xlsx-fixer

Zero dependencies. Uses only the Python standard library (zipfile, xml.etree.ElementTree).

Usage

Python API

from xlsx_fixer import fix, check

# Fix in-place (after openpyxl wb.save())
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws["A1"] = "Hello, Mac Excel!"
wb.save("report.xlsx")

fix("report.xlsx")  # That's it. Mac-safe now.

# Fix to a new file
fix("report.xlsx", output="report_fixed.xlsx")

# Check without modifying
issues = check("report.xlsx")
for issue in issues:
    print(f"[{issue.severity}] {issue.code}: {issue.message}")

CLI

# Fix in-place
xlsx-fixer fix report.xlsx

# Fix to new file
xlsx-fixer fix report.xlsx -o fixed.xlsx

# Check without modifying
xlsx-fixer check report.xlsx

# Version
xlsx-fixer --version

Integration with openpyxl

Add three lines to your existing code (the import, the calc-state reset, and the fix() call):

from openpyxl import Workbook
from xlsx_fixer import fix

wb = Workbook()
# ... your existing workbook code ...

wb.calculation.fullCalcOnLoad = None  # Remove inconsistent calc state
wb.save("output.xlsx")
fix("output.xlsx")  # Convert inline strings to shared string table

What It Fixes

| # | Issue | Root Cause | Impact |
|---|-------|------------|--------|
| 1 | Inline strings | openpyxl hardcodes t="inlineStr" in cell/_writer.py | "We found a problem" dialog on every Mac open |
| 2 | fullCalcOnLoad | openpyxl sets fullCalcOnLoad="1" without generating calcChain.xml | Recovery dialog; formulas show 0 |
| 3 | Stale calcChain.xml | References cells that no longer contain formulas | "Removed Records: Formula" in repair log |
| 4 | Control characters | Illegal XML 1.0 chars (U+0000-U+0008, etc.) in cell values | ST_Xstring validation failures |
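
For issue 4, the stripping step can be pictured as a single regular-expression substitution. The sketch below is an assumption about what "illegal" covers: it removes the characters excluded by the XML 1.0 Char production (the C0 controls other than tab, line feed, and carriage return, plus U+FFFE and U+FFFF). The exact set xlsx-fixer strips is not listed in this README, and strip_illegal_xml_chars is a hypothetical name.

```python
import re

# Characters not allowed in XML 1.0 documents: C0 controls except
# tab (U+0009), line feed (U+000A), and carriage return (U+000D),
# plus the noncharacters U+FFFE and U+FFFF.
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_illegal_xml_chars(text):
    """Return `text` with XML-1.0-illegal characters removed (illustrative)."""
    return _ILLEGAL_XML_CHARS.sub("", text)
```

Tabs and newlines are legal in cell text, so they are left untouched.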

How It Works

  1. Reads the entire .xlsx ZIP into memory
  2. Parses every xl/worksheets/sheet*.xml
  3. Finds all <c t="inlineStr"><is><t>TEXT</t></is></c> cells
  4. Builds a deduplicated shared string table
  5. Rewrites each cell as <c t="s"><v>INDEX</v></c>
  6. Creates xl/sharedStrings.xml with <sst> root element
  7. Adds relationship to xl/_rels/workbook.xml.rels
  8. Adds Override to [Content_Types].xml
  9. Removes fullCalcOnLoad from <calcPr> in xl/workbook.xml
  10. Optionally removes stale calcChain.xml
  11. Writes new ZIP back to the same (or specified output) path
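
Steps 3 to 6 can be sketched for a single worksheet as follows. This is an illustration of the technique under stated assumptions, not xlsx-fixer's actual source: the names inline_to_shared and build_sst are hypothetical, rich-text runs are flattened to plain text, and the ZIP, relationship, and content-type handling from the other steps is omitted.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("", NS)  # serialize without an ns0: prefix


def inline_to_shared(sheet_xml, table=None):
    """Rewrite inlineStr cells of one sheet; return (new_xml, table).

    `table` maps each distinct string to its shared-string index and can
    be passed from sheet to sheet so indices stay workbook-wide.
    """
    table = {} if table is None else table
    root = ET.fromstring(sheet_xml)
    for cell in list(root.iter(f"{{{NS}}}c")):
        if cell.get("t") != "inlineStr":
            continue
        is_el = cell.find(f"{{{NS}}}is")
        if is_el is None:
            continue
        # Join every <t> under <is>; this drops rich-text formatting.
        text = "".join(t.text or "" for t in is_el.iter(f"{{{NS}}}t"))
        index = table.setdefault(text, len(table))  # deduplicate
        cell.remove(is_el)
        cell.set("t", "s")
        ET.SubElement(cell, f"{{{NS}}}v").text = str(index)
    return ET.tostring(root, encoding="unicode"), table


def build_sst(table):
    """Serialize the string table as an <sst> document (index order)."""
    sst = ET.Element(f"{{{NS}}}sst", {"uniqueCount": str(len(table))})
    for text in table:  # dicts preserve insertion order, i.e. index order
        t = ET.SubElement(ET.SubElement(sst, f"{{{NS}}}si"), f"{{{NS}}}t")
        t.text = text
        if text != text.strip():
            t.set(XML_SPACE, "preserve")  # keep leading/trailing whitespace
    return ET.tostring(sst, encoding="unicode")
```

Running inline_to_shared on already converted XML finds no inlineStr cells and leaves the cells alone, which is what makes a second pass harmless.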

The fix is idempotent: running it twice is safe, and the second run is a no-op.
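
Idempotency is easiest to see on one of the smaller steps. The sketch below removes fullCalcOnLoad from <calcPr> (step 9) and shows that a second pass changes nothing. It is a simplified illustration with a hypothetical function name; a real xl/workbook.xml also declares other namespace prefixes that a production implementation would need to preserve.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
ET.register_namespace("", NS)  # keep the default namespace on output


def drop_full_calc_on_load(workbook_xml):
    """Return workbook XML with fullCalcOnLoad removed from <calcPr>."""
    root = ET.fromstring(workbook_xml)
    calc_pr = root.find(f"{{{NS}}}calcPr")
    if calc_pr is not None:
        # pop() with a default is a no-op when the attribute is absent,
        # which is what makes repeated runs safe.
        calc_pr.attrib.pop("fullCalcOnLoad", None)
    return ET.tostring(root, encoding="unicode")


# Simplified stand-in for xl/workbook.xml (assumed content).
workbook_xml = (
    f'<workbook xmlns="{NS}">'
    '<calcPr calcId="124519" fullCalcOnLoad="1"/>'
    '</workbook>'
)
once = drop_full_calc_on_load(workbook_xml)
twice = drop_full_calc_on_load(once)
```

Here once and twice are identical strings, and neither contains fullCalcOnLoad.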

Performance

The fix operates entirely in-memory. On a real production workbook (11 sheets, 195 properties, 448 verification checks, ~100KB):

  • ~30ms on Apple Silicon
  • ~50ms on Intel Mac
  • Scales linearly with file size

Why Not Just Fix openpyxl?

The maintainer has declined to add shared string table support to the writer. The inline string behavior is hardcoded in cell/_writer.py (lines 21-22 and 70-79), with no API flag to change it. This is not a bug they plan to fix; it is a deliberate design decision.

With 249M+ openpyxl downloads per month and 7,400+ dependent packages, this affects an enormous number of Python developers who generate Excel files for Mac users.

Tested On

  • Python 3.9 - 3.14
  • openpyxl 3.0.x - 3.1.x
  • Mac Excel 16.x (Microsoft 365)
  • Windows Excel (passes through unchanged)
  • LibreOffice Calc (passes through unchanged)

License

MIT License. See LICENSE for details.

