Excel to Markdown converter with CSV markdown output support
excel2md
Excel to Markdown converter. Reads Excel workbooks (.xlsx/.xlsm) and automatically generates Markdown format output.
Features
- Smart Table Detection: Automatically detects Excel print areas and converts them to Markdown tables
- CSV Markdown Output: Exports entire sheets in CSV format with validation metadata
- Image Extraction: Extracts images from Excel files and outputs them as Markdown image links
- Mermaid Flowcharts: Generates Mermaid diagrams from Excel shapes and tables
- Hyperlink Support: Multiple output modes (inline, footnote, plain text)
- Split by Sheet: Generate individual files per sheet
- Customizable: Detailed settings for formatting, alignment, and data processing
Use Cases
- Document Generation: Convert Excel specifications to Markdown
- AI/LLM Processing: CSV markdown format optimized for token efficiency
- Flowchart Extraction: Extract diagrams from Excel shapes
- Data Migration: Export Excel data to portable Markdown format
- Version Control: Track Excel changes in text-based format
Documentation
- CHANGELOG.md - Version history
- CONTRIBUTING.md - Contribution guidelines
- SECURITY.md - Security policy and best practices
- Technical specifications are kept in each version directory as `spec.md` and `spec_appendix.md`
Installation
Requires Python 3.10 or higher.
pip install excel2md
# or with uv
uv add excel2md
After installation, the excel2md command is available on your PATH.
Usage
excel2md input.xlsx
This generates:
- `input_csv.md`: CSV markdown format (default)
- `input_images/`: Image directory (if images exist)
Note
- Output filenames and directories are based on the input filename (e.g., `input.xlsx` → `input_csv.md`, `input_images/`)
- Output is saved in the same directory as the input file (use `--csv-output-dir` to change)
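The naming rule in the note above can be sketched in a few lines of Python. This is an illustration of the rule as described, not code taken from excel2md; the helper name `expected_outputs` is hypothetical.

```python
from pathlib import Path


def expected_outputs(input_path, csv_output_dir=None):
    """Return the (csv_markdown_file, image_dir) paths implied by the naming rule."""
    src = Path(input_path)
    # Default: next to the input file; otherwise the directory given via --csv-output-dir.
    out_dir = Path(csv_output_dir) if csv_output_dir is not None else src.parent
    stem = src.stem  # "input" for "input.xlsx"
    return out_dir / f"{stem}_csv.md", out_dir / f"{stem}_images"


csv_md, image_dir = expected_outputs("reports/input.xlsx")
print(csv_md)
print(image_dir)
```

A helper like this can be handy in scripts that need to locate the generated files after running the CLI.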
Common Examples
Convert with Mermaid flowchart support:
excel2md input.xlsx --mermaid-enabled
Generate individual files per sheet:
excel2md input.xlsx --split-by-sheet
Specify CSV markdown output directory:
excel2md input.xlsx --csv-output-dir ./output
# CSV markdown: ./output/input_csv.md
# Images: ./output/input_images/
Output standard Markdown only (no CSV output):
excel2md input.xlsx -o output.md --no-csv-markdown-enabled
Plain text hyperlinks (no Markdown syntax):
excel2md input.xlsx --hyperlink-mode inline_plain
Reduce token count (exclude CSV summary section):
excel2md input.xlsx --no-csv-include-description
Use as a Library
excel2md is also usable as a Python library.
from excel2md import convert_to_markdown
# Pass a path, or raw xlsx bytes (handy for Pyodide / web uploads)
result = convert_to_markdown("input.xlsx", csv_markdown_enabled=False)
print(result["markdown"]) # Generated Markdown string
print(result["output_path"]) # Where the .md file was written
CLI options map 1:1 to keyword arguments (e.g. mermaid_enabled=True, split_by_sheet=True). For multiple conversions sharing the same configuration, use ConversionConfig + ExcelConverter directly.
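As a rough illustration of the 1:1 mapping, the sketch below turns CLI-style option tokens into keyword arguments. The helper `flags_to_kwargs` is hypothetical and not part of excel2md. It assumes that dashes become underscores, that a bare `--no-` flag means `False`, and that values are passed through as strings; check `excel2md --help` and the library signature before relying on any particular name.

```python
def flags_to_kwargs(args):
    """Translate CLI-style option tokens into keyword arguments (illustrative only)."""
    kwargs = {}
    i = 0
    while i < len(args):
        token = args[i]
        if not token.startswith("--"):
            raise ValueError(f"expected an option, got {token!r}")
        name = token[2:].replace("-", "_")
        has_value = i + 1 < len(args) and not args[i + 1].startswith("--")
        if has_value:
            # Option with a value, e.g. --hyperlink-mode inline_plain.
            # This also covers --no-print-area-mode, which takes a value
            # despite its "no-" prefix.
            kwargs[name] = args[i + 1]
            i += 2
        elif name.startswith("no_"):
            # Negated boolean flag, e.g. --no-csv-markdown-enabled
            kwargs[name[3:]] = False
            i += 1
        else:
            # Plain boolean flag, e.g. --mermaid-enabled
            kwargs[name] = True
            i += 1
    return kwargs


options = flags_to_kwargs(
    ["--mermaid-enabled", "--no-csv-markdown-enabled", "--hyperlink-mode", "inline_plain"]
)
print(options)
```

The resulting dictionary could then be passed on as `convert_to_markdown("input.xlsx", **options)`, provided the library accepts those names.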
From source
git clone https://github.com/elvezjp/excel2md.git
cd excel2md
uv sync
See CONTRIBUTING.md for the full developer setup.
Key Options
Output Control
| Option | Default | Description |
|---|---|---|
| `--split-by-sheet` | false | Generate individual files per sheet |
| `--csv-markdown-enabled` | true | Enable CSV markdown output |
| `--csv-output-dir` | Same as input | Output directory for CSV markdown and images |
| `--csv-include-description` | true | Include summary section in CSV output |
| `--csv-include-metadata` | true | Include validation metadata in CSV output |
| `--image-extraction` | true | Enable image extraction |
| `-o`, `--output` | - | Output file path for standard Markdown |
Hyperlink Formats
| Mode | Description | Output Example |
|---|---|---|
| `inline` | Markdown format | `[text](URL)` |
| `inline_plain` | Plain text format | `text (URL)` |
| `footnote` | Footnote format | `[text][^1]` + `[^1]: URL` |
| `text_only` | Display text only | `text` |
| `both` | Inline + footnote | Both formats |
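To make the modes in the table concrete, here is a minimal sketch that reproduces the output examples for a single link. It is not excel2md's implementation, and the function `render_link` is hypothetical. The `both` mode is left out because the table does not spell out its exact output.

```python
def render_link(text, url, mode="inline", index=1):
    """Return (body_text, footnote_definition_or_None) for one hyperlink."""
    if mode == "inline":
        return f"[{text}]({url})", None
    if mode == "inline_plain":
        return f"{text} ({url})", None
    if mode == "footnote":
        # The body carries the marker; the definition goes at the end of the document.
        return f"[{text}][^{index}]", f"[^{index}]: {url}"
    if mode == "text_only":
        return text, None
    raise ValueError(f"unsupported mode in this sketch: {mode!r}")


for mode in ("inline", "inline_plain", "footnote", "text_only"):
    print(mode, render_link("docs", "https://example.com", mode))
```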
Mermaid Flowcharts
| Option | Default | Description |
|---|---|---|
| `--mermaid-enabled` | false | Enable Mermaid conversion |
| `--mermaid-detect-mode` | shapes | Detection mode: `shapes`, `column_headers`, `heuristic` |
| `--mermaid-direction` | TD | Flowchart direction: `TD`, `LR`, `BT`, `RL` |
| `--mermaid-keep-source-table` | true | Output original table along with Mermaid |
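For readers unfamiliar with Mermaid, the sketch below shows how a list of connections and a direction value translate into flowchart source. It only illustrates the target syntax: the function `to_mermaid` and the generated node IDs (`N1`, `N2`, ...) are assumptions, and the tool's real output may differ.

```python
def to_mermaid(edges, direction="TD"):
    """Build Mermaid flowchart source from (source_label, target_label) pairs."""
    if direction not in {"TD", "LR", "BT", "RL"}:
        raise ValueError(f"unknown direction: {direction!r}")

    ids = {}

    def node_id(label):
        # Assign a stable ID the first time a label is seen.
        if label not in ids:
            ids[label] = f"N{len(ids) + 1}"
        return ids[label]

    lines = [f"flowchart {direction}"]
    for src, dst in edges:
        src_id = node_id(src)
        dst_id = node_id(dst)
        lines.append(f'    {src_id}["{src}"] --> {dst_id}["{dst}"]')
    return "\n".join(lines)


print(to_mermaid([("Start", "Check"), ("Check", "Done")], direction="LR"))
```

Labels containing double quotes are not escaped here, so real input would need extra handling.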
Table Processing
| Option | Default | Description |
|---|---|---|
| `--header-detection` | first_row | Treat first row as header |
| `--align-detection` | numbers_right | Right-align numeric columns |
| `--max-cells-per-table` | 200000 | Maximum cells per table |
| `--no-print-area-mode` | used_range | Behavior when print area not set |
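The two defaults `first_row` and `numbers_right` can be pictured with the following sketch, which treats the first row as the header and right-aligns columns whose body cells are all numeric. This is an assumed reading of those defaults, not the converter's actual logic; `to_markdown_table` and `is_number` are hypothetical helpers.

```python
def is_number(value):
    """True if the value can be read as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def format_row(cells):
    return "| " + " | ".join(str(c) for c in cells) + " |"


def to_markdown_table(rows):
    """Render rectangular rows as a Markdown table; the first row is the header."""
    header, *body = rows
    aligns = []
    for col in range(len(header)):
        cells = [row[col] for row in body]
        numeric = bool(cells) and all(is_number(c) for c in cells)
        # "---:" right-aligns a column in GitHub-flavored Markdown tables.
        aligns.append("---:" if numeric else "---")
    lines = [format_row(header), format_row(aligns)]
    lines.extend(format_row(row) for row in body)
    return "\n".join(lines)


print(to_markdown_table([["Item", "Qty"], ["Apple", 3], ["Pear", 12]]))
```

Pipe characters inside cells are not escaped in this sketch.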
Advanced Options
List all options:
excel2md --help
Key advanced options:
- Cell merge policy
- Date/number format control
- Whitespace handling
- Markdown escape level
- Hidden row/column policy
- Locale-specific formatting
Output Examples
Real input / output samples (including images) live under docs/examples/. Each version directory contains:
- Input `.xlsx` files
- `output-default/`: default mode (CSV markdown + image extraction)
- `output-markdown/`: standard Markdown mode (`--no-csv-markdown-enabled`)
- `output-mermaid/`: Mermaid flowchart enabled (`--mermaid-enabled`)
The regeneration commands for each pattern are documented in docs/examples/README.md.
Directory Structure
excel2md/
├── v2.2.1/ # Current package source
│ ├── excel_to_md.py # Entry point
│ ├── excel2md/ # Main package
│ ├── tests/ # Test suite
│ ├── spec.md # Specification
│ └── spec_appendix.md # Specification appendix
├── versions/ # Frozen historical snapshots (excluded from PyPI)
│ ├── README.md # Overview of this directory (Japanese)
│ └── v*/ # One subdirectory per past release
├── docs/ # Documentation
├── pyproject.toml # Project metadata
├── LICENSE # MIT License
├── README.md / _ja.md # README (English / Japanese)
├── CONTRIBUTING.md / _ja.md # Contribution guide (English / Japanese)
├── SECURITY.md / _ja.md # Security policy (English / Japanese)
└── CHANGELOG.md / _ja.md # Version history (English / Japanese)
Security
For security concerns, please see SECURITY.md.
Key security notes:
- Only process Excel files from trusted sources
- excel2md does not save changes to input workbooks; use `--read-only` when you prefer openpyxl read-only loading
- Excel macros are not executed
- Sanitize Markdown output to prevent injection
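As one possible reading of the last note, the sketch below neutralizes two common injection vectors in generated Markdown before it is rendered: raw HTML tags and script-like link schemes. It is a deliberately small illustration, not a complete sanitizer and not something excel2md provides; the function `neutralize_markdown` is hypothetical. For untrusted workbooks, sanitizing the rendered HTML with a dedicated sanitizer is likely the safer route.

```python
import re

# Inline links whose destination starts with a script-like scheme.
_RISKY_LINK = re.compile(r"\]\(\s*(?:javascript|data|vbscript):", re.IGNORECASE)


def neutralize_markdown(markdown_text):
    """Escape raw HTML openers and defuse risky inline link schemes."""
    # With every "<" escaped, no raw HTML tag can start.
    # Side effect: autolinks like <https://...> stop working, and "<" inside
    # code spans or code blocks will display as "&lt;".
    safe = markdown_text.replace("<", "&lt;")
    # Turn the risky destination into a harmless fragment link.
    return _RISKY_LINK.sub("](#blocked:", safe)


print(neutralize_markdown("<script>alert(1)</script> [x](javascript:alert(1))"))
```

Reference-style links and other vectors are not covered by this sketch.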
Contributing
Contributions are welcome! See CONTRIBUTING.md for details.
- Report bugs via GitHub Issues
- Submit pull requests for improvements
- Follow existing code style
- Add tests for new features
Changelog
See CHANGELOG.md for details.
Background
This tool was created during the development of IXV, an AI development ecosystem designed for Japanese engineering teams.
IXV delivers a methodology and OSS that put AI to practical use in real development workflows. This repository publishes a portion of that work.
License
MIT License - See LICENSE for details.
Contact
- Email: info@elvez.co.jp
- Company: Elvez, Inc.