Fix Mac Excel corruption in openpyxl-generated .xlsx files
xlsx-fixer
Fix the Mac Excel "We found a problem with some content" error in openpyxl-generated .xlsx files.
openpyxl hardcodes inline strings (t="inlineStr") for every text cell. Mac Excel's strict OOXML parser rejects this, showing a recovery dialog on every open. Windows Excel silently accepts it, which is why the bug goes unnoticed during development.
xlsx-fixer rewrites the ZIP to use a proper shared string table (xl/sharedStrings.xml), removes inconsistent calc state, and strips illegal control characters. One function call. Zero dependencies beyond the standard library.
The Problem
If you generate .xlsx files with Python's openpyxl library and open them on Mac, you see:
"We found a problem with some content in 'file.xlsx'. Do you want us to try to recover as much as we can?"
After clicking "Yes":
"Removed Records: String properties from /xl/sharedStrings.xml part"
This happens because openpyxl writes every text cell as an inline string (<c t="inlineStr"><is><t>text</t></is></c>) instead of referencing a shared string table. The openpyxl maintainer has declined to fix this for 17+ years.
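To see whether a given file is affected, you can count the inline-string cells yourself. The following is a minimal sketch using only the standard library; `count_inline_strings` is a hypothetical helper written for illustration, not part of the xlsx-fixer API.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def count_inline_strings(xlsx_bytes: bytes) -> dict:
    """Return {worksheet part name: number of cells with t="inlineStr"}."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        for name in zf.namelist():
            # Worksheet parts live under xl/worksheets/ and end in .xml
            # (relationship parts end in .rels and are skipped).
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                root = ET.fromstring(zf.read(name))
                cells = root.iter(f"{{{NS}}}c")
                counts[name] = sum(1 for c in cells if c.get("t") == "inlineStr")
    return counts
```

Call it with the raw file contents, for example `count_inline_strings(open("report.xlsx", "rb").read())`. Any non-zero count means the workbook contains inline-string cells.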
Install
pip install xlsx-fixer
Zero dependencies. Uses only the Python standard library (zipfile, xml.etree.ElementTree).
Usage
Python API
from xlsx_fixer import fix, check
# Fix in-place (after openpyxl wb.save())
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws["A1"] = "Hello, Mac Excel!"
wb.save("report.xlsx")
fix("report.xlsx") # That's it. Mac-safe now.
# Fix to a new file
fix("report.xlsx", output="report_fixed.xlsx")
# Check without modifying
issues = check("report.xlsx")
for issue in issues:
    print(f"[{issue.severity}] {issue.code}: {issue.message}")
CLI
# Fix in-place
xlsx-fixer fix report.xlsx
# Fix to new file
xlsx-fixer fix report.xlsx -o fixed.xlsx
# Check without modifying
xlsx-fixer check report.xlsx
# Version
xlsx-fixer --version
Integration with openpyxl
Add two lines to your existing code:
from openpyxl import Workbook
from xlsx_fixer import fix
wb = Workbook()
# ... your existing workbook code ...
wb.calculation.fullCalcOnLoad = None # Remove inconsistent calc state
wb.save("output.xlsx")
fix("output.xlsx") # Convert inline strings to shared string table
What It Fixes
| # | Issue | Root Cause | Impact |
|---|---|---|---|
| 1 | Inline strings | openpyxl hardcodes t="inlineStr" in cell/_writer.py | "We found a problem" dialog on every Mac open |
| 2 | fullCalcOnLoad | openpyxl sets fullCalcOnLoad="1" without generating calcChain.xml | Recovery dialog; formulas show 0 |
| 3 | Stale calcChain.xml | References cells that no longer contain formulas | "Removed Records: Formula" in repair log |
| 4 | Control characters | Illegal XML 1.0 chars (U+0000-U+0008, etc.) in cell values | ST_Xstring validation failures |
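For issue 4, the set of illegal characters follows from the XML 1.0 `Char` production, which allows only tab, line feed, and carriage return below U+0020 and excludes surrogates, U+FFFE, and U+FFFF. The sketch below shows one way to strip them from a cell value; `strip_illegal_xml_chars` is a hypothetical name and may not match what xlsx-fixer does internally.

```python
import re

# Characters outside the XML 1.0 Char production:
# C0 controls except \t, \n, \r; lone surrogates; U+FFFE and U+FFFF.
_ILLEGAL_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Remove characters that cannot appear in an XML 1.0 document."""
    return _ILLEGAL_XML_CHARS.sub("", text)
```

Applying this to values before assigning them to cells avoids writing characters that a strict parser would reject.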
How It Works
- Reads the entire .xlsx ZIP into memory
- Parses every xl/worksheets/sheet*.xml
- Finds all <c t="inlineStr"><is><t>TEXT</t></is></c> cells
- Builds a deduplicated shared string table
- Rewrites each cell as <c t="s"><v>INDEX</v></c>
- Creates xl/sharedStrings.xml with <sst> root element
- Adds relationship to xl/_rels/workbook.xml.rels
- Adds Override to [Content_Types].xml
- Removes fullCalcOnLoad from <calcPr> in xl/workbook.xml
- Optionally removes stale calcChain.xml
- Writes new ZIP back to the same (or specified output) path
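The cell-rewriting and table-building steps above can be sketched as follows. This is an illustration under simplifying assumptions, not the package's actual implementation: `inline_to_shared` and `build_sst` are hypothetical names, rich-text runs are flattened to plain text, and the relationship and content-type updates are omitted.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("", NS)  # serialize without an ns0: prefix


def inline_to_shared(sheet_xml: bytes, table: dict) -> bytes:
    """Rewrite inlineStr cells in one sheet as shared-string references.

    `table` maps text -> index and is passed to every sheet in turn,
    so strings are deduplicated across the whole workbook.
    """
    root = ET.fromstring(sheet_xml)
    for cell in root.iter(f"{{{NS}}}c"):
        if cell.get("t") != "inlineStr":
            continue
        is_el = cell.find(f"{{{NS}}}is")
        if is_el is None:
            continue
        # Simplification: join all <t> runs, discarding rich-text formatting.
        text = "".join(t.text or "" for t in is_el.iter(f"{{{NS}}}t"))
        index = table.setdefault(text, len(table))
        cell.remove(is_el)
        cell.set("t", "s")
        ET.SubElement(cell, f"{{{NS}}}v").text = str(index)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_sst(table: dict) -> bytes:
    """Serialize the table as the content of xl/sharedStrings.xml."""
    sst = ET.Element(f"{{{NS}}}sst", uniqueCount=str(len(table)))
    # dicts keep insertion order, which matches the assigned indexes.
    for text in table:
        t = ET.SubElement(ET.SubElement(sst, f"{{{NS}}}si"), f"{{{NS}}}t")
        t.text = text
        if text != text.strip():
            t.set(XML_SPACE, "preserve")  # keep leading/trailing whitespace
    return ET.tostring(sst, encoding="utf-8", xml_declaration=True)
```

Because cells without t="inlineStr" are skipped, running `inline_to_shared` on its own output leaves the table and the cell references unchanged. Note that ElementTree may rename other namespace prefixes on output, which a production implementation would need to handle.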
The fix is idempotent: running it twice is safe, and the second run is a no-op.
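One way to get this kind of idempotence is for each step to report whether it changed anything and to return its input untouched when it did not. The sketch below applies that idea to the fullCalcOnLoad step; `drop_full_calc_on_load` is a hypothetical helper, and the real package may structure this differently.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
ET.register_namespace("", NS)  # serialize without an ns0: prefix


def drop_full_calc_on_load(workbook_xml: bytes) -> tuple[bytes, bool]:
    """Remove fullCalcOnLoad from <calcPr>; return (xml, changed)."""
    root = ET.fromstring(workbook_xml)
    changed = False
    for calc_pr in root.iter(f"{{{NS}}}calcPr"):
        if calc_pr.attrib.pop("fullCalcOnLoad", None) is not None:
            changed = True
    if not changed:
        # Nothing to do: hand back the original bytes unmodified.
        return workbook_xml, False
    return ET.tostring(root, encoding="utf-8", xml_declaration=True), True
```

A second call on the output finds no attribute to remove and returns the same bytes with `changed` set to False.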
Performance
The fix operates entirely in-memory. On a real production workbook (11 sheets, 195 properties, 448 verification checks, ~100KB):
- ~30ms on Apple Silicon
- ~50ms on Intel Mac
- Scales linearly with file size
Why Not Just Fix openpyxl?
The maintainer has permanently refused to add shared string table support to the writer. The inline string behavior is hardcoded in cell/_writer.py lines 21-22 and 70-79 with no API flag to change it. This isn't a bug they plan to fix; it's a design decision from 2007.
With 249M+ openpyxl downloads per month and 7,400+ dependent packages, this affects an enormous number of Python developers who generate Excel files for Mac users.
Tested On
- Python 3.9 - 3.14
- openpyxl 3.0.x - 3.1.x
- Mac Excel 16.x (Microsoft 365)
- Windows Excel (passes through unchanged)
- LibreOffice Calc (passes through unchanged)
License
MIT License. See LICENSE for details.