Extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings or openpyxl. CLI and Python API.

Excel Extractor (xlwings)

A Python library to extract Excel formulas, validations, hyperlinks, notes, and rich cell metadata using xlwings. Provides a simple Python API and command-line tools.

Install

  • Requirements for the xlwings engine: Windows, Microsoft Excel, Python 3.8+
  • Install the dependency in your project: pip install xlwings
  • This repo includes a ready-to-package library under excel_extractor/.
  • From PyPI: pip install excel-extractor-v1

Cross-platform (openpyxl engine)

  • On macOS/Linux or Windows without Excel, use the openpyxl engine:
    • CLI: add --engine openpyxl
    • Programmatic: from excel_extractor import OpenpyxlExcelExtractor
    • Note: display text and live calculated values are limited, because openpyxl reads what is stored in the file and does not evaluate formulas.
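
The limitation in the note above follows from how .xlsx files store formulas: a formula cell holds the formula text and, if a spreadsheet application calculated and saved the workbook, the last computed value. A reader that does not evaluate formulas can only return what is stored. The sketch below parses a hand-written worksheet XML fragment with the standard library to show both pieces. It illustrates the file format only, not how excel_extractor or openpyxl is implemented; the fragment and the read_cell helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by worksheet XML inside .xlsx files.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hand-written fragment for illustration; real sheets carry more attributes.
SHEET_XML = """
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1"><v>1</v></c>
      <c r="B1"><v>2</v></c>
      <c r="C1"><f>SUM(A1:B1)</f><v>3</v></c>
      <c r="D1"><f>C1*2</f></c>
    </row>
  </sheetData>
</worksheet>
"""


def read_cell(cell):
    """Return (address, formula or None, cached value or None) for a <c> element."""
    formula = cell.find("m:f", NS)
    value = cell.find("m:v", NS)
    return (
        cell.get("r"),
        None if formula is None else "=" + formula.text,
        None if value is None else value.text,
    )


root = ET.fromstring(SHEET_XML)
cells = [read_cell(c) for c in root.iterfind(".//m:c", NS)]
for address, formula, cached in cells:
    print(address, formula, cached)
```

Cell D1 has a formula but no cached value, which is the situation where a non-evaluating engine has nothing to report as the result.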

CLI

  • Basic (all formulas on active sheet):
python -m excel_extractor "Workbook.xlsx"  # defaults to xlwings on Windows, openpyxl elsewhere
  • Force engine:
python -m excel_extractor "Workbook.xlsx" --engine openpyxl
python -m excel_extractor "Workbook.xlsx" --engine xlwings
  • Specific worksheet:
python -m excel_extractor "Workbook.xlsx" --sheet "Sheet1"
  • Range only:
python -m excel_extractor "Workbook.xlsx" --range "A1:D10"
  • Formula dependencies for a cell:
python -m excel_extractor "Workbook.xlsx" --dependencies "B5"
  • Full details (formatting, validations, hyperlinks, notes):
python -m excel_extractor "Workbook.xlsx" --full
  • Full details for all sheets:
python -m excel_extractor "Workbook.xlsx" --full --all-sheets
  • Text output instead of JSON:
python -m excel_extractor "Workbook.xlsx" --format text
  • Convert a previously generated *_full_details.json into per-sheet CSV/JSON files and an index:
python -m excel_extractor.convert_excel_json "Workbook_full_details.json" --out exports --ndjson
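
The commands above can also be driven from a script. The following is a minimal sketch that assembles the argument list from the flags listed in this section and runs it with subprocess. The helper names build_command and run_extractor are hypothetical, and the README does not say whether the CLI prints its result or writes a file, so the sketch only captures whatever the process emits.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(
    workbook: str,
    engine: Optional[str] = None,
    sheet: Optional[str] = None,
    cell_range: Optional[str] = None,
    dependencies: Optional[str] = None,
    full: bool = False,
    all_sheets: bool = False,
    output_format: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `python -m excel_extractor`."""
    cmd = [sys.executable, "-m", "excel_extractor", workbook]
    if engine:
        cmd += ["--engine", engine]
    if sheet:
        cmd += ["--sheet", sheet]
    if cell_range:
        cmd += ["--range", cell_range]
    if dependencies:
        cmd += ["--dependencies", dependencies]
    if full:
        cmd.append("--full")
    if all_sheets:
        cmd.append("--all-sheets")
    if output_format:
        cmd += ["--format", output_format]
    return cmd


def run_extractor(workbook: str, **options) -> subprocess.CompletedProcess:
    """Run the CLI and capture its output (requires excel_extractor to be installed)."""
    # Assumption: a non-zero return code signals failure; the README does not say.
    return subprocess.run(
        build_command(workbook, **options),
        capture_output=True,
        text=True,
        check=False,
    )


cmd = build_command("Workbook.xlsx", engine="openpyxl", full=True, all_sheets=True)
print(" ".join(cmd[1:]))
```

Passing the arguments as a list avoids shell quoting problems with workbook paths that contain spaces.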

Python API

from excel_extractor import ExcelFormulaExtractor, OpenpyxlExcelExtractor

# 1) Windows + Excel (xlwings)
with ExcelFormulaExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")

# 2) Cross-platform (openpyxl)
with OpenpyxlExcelExtractor("Workbook.xlsx") as extractor:
    data = extractor.extract_sheet_full_details("Sheet1")
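
Because both classes above are used the same way, a script can pick one at runtime. The sketch below mirrors the CLI's documented default (xlwings on Windows, openpyxl elsewhere). The helpers pick_extractor_name and open_extractor are hypothetical and not part of the package; the have_excel flag is an assumption added so that Windows machines without Excel can opt into openpyxl.

```python
import sys


def pick_extractor_name(platform: str = sys.platform, have_excel: bool = True) -> str:
    """Return the class name to use: xlwings-backed on Windows with Excel, else openpyxl."""
    if platform.startswith("win") and have_excel:
        return "ExcelFormulaExtractor"
    return "OpenpyxlExcelExtractor"


def open_extractor(path: str, have_excel: bool = True):
    """Instantiate the chosen extractor class (requires excel_extractor to be installed)."""
    # Imported lazily so the selection logic can be exercised without the package.
    import excel_extractor

    cls = getattr(excel_extractor, pick_extractor_name(have_excel=have_excel))
    return cls(path)


# Usage, assuming the package is installed and Workbook.xlsx exists:
#
#     with open_extractor("Workbook.xlsx") as extractor:
#         data = extractor.extract_sheet_full_details("Sheet1")
```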

Public API (summary)

  • Class ExcelFormulaExtractor(excel_file_path: str) (xlwings)

  • Class OpenpyxlExcelExtractor(excel_file_path: str) (openpyxl)

    • Context manager: opens/quits workbook automatically
    • get_worksheet_info(sheet_name: Optional[str]) -> dict
    • extract_formulas_from_range(start_cell: str, end_cell: Optional[str]) -> list[dict]
    • extract_all_formulas(sheet_name: Optional[str]) -> dict
    • extract_sheet_full_details(sheet_name: Optional[str]) -> dict
    • extract_workbook_full_details() -> dict
    • extract_formula_dependencies(cell_address: str) -> dict
    • export_to_json(data: dict, output_file: str) -> bool
    • export_to_text(data: dict, output_file: str) -> bool
  • Console scripts (after packaging):

    • excel-extractor → same as python -m excel_extractor
    • excel-extractor-convert → same as python -m excel_extractor.convert_excel_json
  • Programmatic converter (optional):

    • excel_extractor.tools.convert_full_details_json(input_json: Path, output_dir: Path, make_ndjson: bool) -> None
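
The README does not show what extract_formula_dependencies returns or how it works. As a rough idea of what dependency extraction involves, the sketch below pulls A1-style references out of a formula string with a regular expression. It is a simplification and not the library's method: it ignores defined names, structured table references, quoted sheet names, lowercase references, and references inside string literals. The find_references helper is hypothetical.

```python
import re
from typing import List, Optional, Tuple

REF = re.compile(
    r"(?<![A-Za-z0-9_])"               # not in the middle of another token
    r"(?:([A-Za-z_][A-Za-z0-9_]*)!)?"  # optional unquoted sheet name
    r"(\$?[A-Z]{1,3}\$?\d+"            # first cell, e.g. A1 or $C$3
    r"(?::\$?[A-Z]{1,3}\$?\d+)?)"      # optional range end, e.g. :B2
    r"(?![A-Za-z0-9_(])"               # not a function call such as LOG10(
)


def find_references(formula: str) -> List[Tuple[Optional[str], str]]:
    """Return (sheet or None, reference) pairs found in a formula string."""
    return [
        (match.group(1), match.group(2).replace("$", ""))
        for match in REF.finditer(formula)
    ]


refs = find_references("=SUM(A1:B2)+Sheet2!$C$3*LOG10(D4)")
print(refs)
```

The trailing lookahead matters because names such as LOG10 are also syntactically valid cell addresses; only the following parenthesis marks them as function calls.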

Notes

  • Full-detail extraction returns, per cell: value, formula, display text (xlwings only), basic formatting (number format, font, alignment), fill color, hyperlink, note/comment, data validation (including resolved list items when possible), and merge info.
  • xlwings automation requires a local Excel installation. To watch Excel while debugging, edit the code path that creates xw.App(visible=False) and set visible=True.
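
The first note lists what each cell record can carry. The README does not document the exact key names or nesting of the returned data, so the following is only a sketch of how such records might be summarized. The key names in METADATA_KEYS and the sample records are hypothetical; inspect the JSON your installed version produces and adjust them.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Hypothetical key names; check the JSON your version actually produces.
METADATA_KEYS = ("formula", "hyperlink", "note", "validation")


def count_metadata(cells: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many cell records carry each kind of metadata."""
    counts: Counter = Counter()
    for cell in cells:
        for key in METADATA_KEYS:
            if cell.get(key):  # treat missing, None, and empty values as absent
                counts[key] += 1
    return counts


# Made-up records standing in for real extractor output.
sample = [
    {"address": "A1", "value": 10, "formula": None},
    {"address": "B1", "value": 20, "formula": "=A1*2", "note": "doubles A1"},
    {"address": "C1", "value": "Docs", "hyperlink": "https://example.com"},
]
summary = count_metadata(sample)
print(dict(summary))
```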

License

MIT License. See LICENSE.
