Extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. CLI and Python API.
Excel Extractor (xlwings)
A Python library to extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings. Provides a simple Python API and command-line tools.
Install
- Requirements: Windows, Microsoft Excel, Python 3.8+
- Install deps in your project:
pip install xlwings
- This repo includes a ready-to-package library under excel_extractor/.
- After publishing:
pip install excel-extractor-v1
Cross-platform (openpyxl engine)
- On macOS/Linux or Windows without Excel, use the openpyxl engine:
  - CLI: add --engine openpyxl
  - Programmatic: use from excel_extractor import OpenpyxlExcelExtractor
  - Note: display text and live calc values are limited because openpyxl does not evaluate formulas.
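The engine rule described in this section can be sketched in a few lines. This is only an illustration, not the library's actual selection logic: the helper name `pick_engine` and its `excel_installed` flag are hypothetical, and only the rule itself (xlwings on Windows with Excel, openpyxl otherwise) comes from this README.

```python
import platform
from typing import Optional


def pick_engine(system: Optional[str] = None, excel_installed: bool = True) -> str:
    """Return the engine name to pass to --engine (hypothetical helper).

    xlwings automates a local Excel, so it is only chosen on Windows
    when Excel is available; everything else falls back to openpyxl.
    """
    # platform.system() returns "Windows", "Linux", or "Darwin" (macOS).
    system = system if system is not None else platform.system()
    if system == "Windows" and excel_installed:
        return "xlwings"
    return "openpyxl"
```

Whether Excel is actually installed has to be determined by the caller; the sketch simply takes it as a flag.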
CLI
- Basic (all formulas on active sheet):
python -m excel_extractor "Workbook.xlsx" # defaults to xlwings on Windows, openpyxl elsewhere
- Force engine:
python -m excel_extractor "Workbook.xlsx" --engine openpyxl
python -m excel_extractor "Workbook.xlsx" --engine xlwings
- Specific worksheet:
python -m excel_extractor "Workbook.xlsx" --sheet "Sheet1"
- Range only:
python -m excel_extractor "Workbook.xlsx" --range "A1:D10"
- Formula dependencies for a cell:
python -m excel_extractor "Workbook.xlsx" --dependencies "B5"
- Full details (formatting, validations, hyperlinks, notes):
python -m excel_extractor "Workbook.xlsx" --full
- Full details for all sheets:
python -m excel_extractor "Workbook.xlsx" --full --all-sheets
- Text output instead of JSON:
python -m excel_extractor "Workbook.xlsx" --format text
- Convert a previously generated *_full_details.json into per-sheet CSV/JSON and index:
python -m excel_extractor.convert_excel_json "Workbook_full_details.json" --out exports --ndjson
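When the CLI is driven from another script, the invocations above can be assembled as an argument list. The following is a minimal sketch: `build_command` is a hypothetical helper, it uses only the flags shown in the examples above, and it assumes (without confirmation from the source) that those flags can be combined freely.

```python
import sys
from typing import List, Optional


def build_command(
    workbook: str,
    engine: Optional[str] = None,
    sheet: Optional[str] = None,
    cell_range: Optional[str] = None,
    dependencies: Optional[str] = None,
    full: bool = False,
    all_sheets: bool = False,
    text_output: bool = False,
) -> List[str]:
    """Build an argv list for `python -m excel_extractor` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "excel_extractor", workbook]
    if engine:
        cmd += ["--engine", engine]
    if sheet:
        cmd += ["--sheet", sheet]
    if cell_range:
        cmd += ["--range", cell_range]
    if dependencies:
        cmd += ["--dependencies", dependencies]
    if full:
        cmd.append("--full")
    if all_sheets:
        cmd.append("--all-sheets")
    if text_output:
        # JSON is the default output; only the text variant needs a flag.
        cmd += ["--format", "text"]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with workbook names that contain spaces.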
Python API
from excel_extractor import ExcelFormulaExtractor, OpenpyxlExcelExtractor

# 1) Windows + Excel (xlwings)
with ExcelFormulaExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")

# 2) Cross-platform (openpyxl)
with OpenpyxlExcelExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")
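Because both extractors are used the same way, a small wrapper can take the class as a parameter and combine extraction with export. This is a sketch under assumptions: `extract_sheet_to_json` is a hypothetical helper, and it assumes `export_to_json` is called on the extractor instance with the signature listed in the Public API summary.

```python
from typing import Any, Callable, Optional


def extract_sheet_to_json(
    extractor_cls: Callable[[str], Any],
    workbook_path: str,
    output_file: str,
    sheet_name: Optional[str] = None,
) -> bool:
    """Extract one sheet's full details and write them to a JSON file.

    extractor_cls is expected to be ExcelFormulaExtractor or
    OpenpyxlExcelExtractor; any context manager exposing the same two
    methods works, which keeps the helper testable without Excel.
    """
    with extractor_cls(workbook_path) as extractor:
        data = extractor.extract_sheet_full_details(sheet_name)
        # export_to_json is documented as returning a bool.
        return extractor.export_to_json(data, output_file)
```

A call might look like `extract_sheet_to_json(OpenpyxlExcelExtractor, "Workbook.xlsx", "Sheet1.json", "Sheet1")`.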
Public API (summary)
- Class ExcelFormulaExtractor(excel_file_path: str) (xlwings)
- Class OpenpyxlExcelExtractor(excel_file_path: str) (openpyxl)
  - Context manager: opens/quits workbook automatically
  - get_worksheet_info(sheet_name: Optional[str]) -> dict
  - extract_formulas_from_range(start_cell: str, end_cell: Optional[str]) -> list[dict]
  - extract_all_formulas(sheet_name: Optional[str]) -> dict
  - extract_sheet_full_details(sheet_name: Optional[str]) -> dict
  - extract_workbook_full_details() -> dict
  - extract_formula_dependencies(cell_address: str) -> dict
  - export_to_json(data: dict, output_file: str) -> bool
  - export_to_text(data: dict, output_file: str) -> bool
- Console scripts (after packaging):
  - excel-extractor → same as python -m excel_extractor
  - excel-extractor-convert → same as python -m excel_extractor.convert_excel_json
- Programmatic converter (optional):
  - excel_extractor.tools.convert_full_details_json(input_json: Path, output_dir: Path, make_ndjson: bool) -> None
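The `--ndjson` flag and the `make_ndjson` parameter refer to newline-delimited JSON, where every line is one standalone JSON object. How the converter shapes its records is not documented here, so the sketch below uses a hypothetical record layout (`address` and `formula` keys) and a hypothetical `write_ndjson` helper purely to illustrate the format.

```python
import io
import json
from typing import Iterable, Mapping, TextIO


def write_ndjson(records: Iterable[Mapping], stream: TextIO) -> int:
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    for record in records:
        stream.write(json.dumps(dict(record), ensure_ascii=False) + "\n")
        count += 1
    return count


# Hypothetical per-cell records; the real converter's field names may differ.
sample = [
    {"address": "A1", "formula": "=B1+1"},
    {"address": "A2", "formula": None},
]
buffer = io.StringIO()
written = write_ndjson(sample, buffer)
```

Each line can be parsed on its own with `json.loads`, which makes the format convenient for streaming large sheets line by line instead of loading one big JSON document.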
Notes
- Full-detail extraction returns, per cell: value, formula, display text (xlwings only), basic formatting (number format, font, alignment), fill color, hyperlink, note/comment, data validation (including resolved list items when possible), and merge info.
- xlwings automation requires a local Excel installation. To make the Excel window visible for debugging, edit the code path that creates xw.App(visible=False).
License
MIT License. See LICENSE.