Excel to Markdown converter with CSV markdown output support
excel2md
Excel to Markdown converter. Reads Excel workbooks (.xlsx/.xlsm) and automatically generates Markdown output.
Features
- Smart Table Detection: Automatically detects Excel print areas and converts them to Markdown tables
- CSV Markdown Output: Exports entire sheets in CSV format with validation metadata
- Image Extraction: Extracts images from Excel files and outputs them as Markdown image links
- Mermaid Flowcharts: Generates Mermaid diagrams from Excel shapes and tables
- Hyperlink Support: Multiple output modes (inline, footnote, plain text)
- Split by Sheet: Generate individual files per sheet
- Customizable: Detailed settings for formatting, alignment, and data processing
Use Cases
- Document Generation: Convert Excel specifications to Markdown
- AI/LLM Processing: CSV markdown format optimized for token efficiency
- Flowchart Extraction: Extract diagrams from Excel shapes
- Data Migration: Export Excel data to portable Markdown format
- Version Control: Track Excel changes in text-based format
Documentation
- CHANGELOG.md - Version history
- CONTRIBUTING.md - Contribution guidelines
- SECURITY.md - Security policy and best practices
- v2.2.0/spec.md - Technical specification (v2.2.0, latest)
- v2.1.0/spec.md - Technical specification (v2.1.0, frozen snapshot)
- v1.8/spec.md - Technical specification (v1.8)
Setup
Requirements
- Python 3.10 or higher
- uv package manager
Install Dependencies
# Install uv (if not already installed)
# Details: https://docs.astral.sh/uv/getting-started/installation/
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
Usage
uv run python v2.2.0/excel_to_md.py input.xlsx
This generates:
- `input_csv.md`: CSV markdown format (default)
- `input_images/`: Image directory (if images exist)
Note
- Output filenames and directories are based on input filename (e.g., `input.xlsx` → `input_csv.md`, `input_images/`)
- Output is saved in the same directory as input file (use `--csv-output-dir` to change)
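The naming rule in the note above can be sketched in a few lines of Python. This is only an illustration of the documented convention, not excel2md's internal code; the function name `expected_outputs` is hypothetical.

```python
from pathlib import Path


def expected_outputs(input_path, csv_output_dir=None):
    """Return (csv_markdown_path, images_dir) for a workbook path.

    Hypothetical helper that mirrors the documented naming rule:
    <stem>_csv.md and <stem>_images/, next to the input file unless
    an output directory is given.
    """
    src = Path(input_path)
    # Default location is the input file's own directory.
    out_dir = Path(csv_output_dir) if csv_output_dir else src.parent
    return out_dir / f"{src.stem}_csv.md", out_dir / f"{src.stem}_images"


csv_md, images_dir = expected_outputs("reports/input.xlsx")
print(csv_md.name)      # input_csv.md
print(images_dir.name)  # input_images
```

A script that post-processes the converter's output could use a helper like this to locate the generated files without hard-coding paths.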
Common Examples
Convert with Mermaid flowchart support:
uv run python v2.2.0/excel_to_md.py input.xlsx --mermaid-enabled
Generate individual files per sheet:
uv run python v2.2.0/excel_to_md.py input.xlsx --split-by-sheet
Specify CSV markdown output directory:
uv run python v2.2.0/excel_to_md.py input.xlsx --csv-output-dir ./output
# CSV markdown: ./output/input_csv.md
# Images: ./output/input_images/
Output standard Markdown only (no CSV output):
uv run python v2.2.0/excel_to_md.py input.xlsx -o output.md --no-csv-markdown-enabled
Plain text hyperlinks (no Markdown syntax):
uv run python v2.2.0/excel_to_md.py input.xlsx --hyperlink-mode inline_plain
Reduce token count (exclude CSV summary section):
uv run python v2.2.0/excel_to_md.py input.xlsx --no-csv-include-description
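For batch conversion, the commands above can be driven from a small Python script. The sketch below assumes it runs from the repository root with `uv` on the PATH; the helper names `build_command` and `convert_all` are hypothetical, and only flags listed in this README are passed through.

```python
import subprocess
from pathlib import Path


def build_command(workbook, *flags, script="v2.2.0/excel_to_md.py"):
    """Assemble the documented `uv run` invocation for one workbook."""
    return ["uv", "run", "python", script, str(workbook), *flags]


def convert_all(folder, *flags):
    """Run the converter on every .xlsx file directly inside `folder`."""
    for workbook in sorted(Path(folder).glob("*.xlsx")):
        # check=True raises CalledProcessError if the converter exits non-zero.
        subprocess.run(build_command(workbook, *flags), check=True)


# Show the command that would be executed for a single file.
print(build_command("input.xlsx", "--mermaid-enabled"))
```

Calling `convert_all("some_folder", "--split-by-sheet")` would then convert each workbook in turn and stop at the first failure.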
Key Options
Output Control
| Option | Default | Description |
|---|---|---|
| `--split-by-sheet` | false | Generate individual files per sheet |
| `--csv-markdown-enabled` | true | Enable CSV markdown output |
| `--csv-output-dir` | Same as input | Output directory for CSV markdown and images |
| `--csv-include-description` | true | Include summary section in CSV output |
| `--csv-include-metadata` | true | Include validation metadata in CSV output |
| `--image-extraction` | true | Enable image extraction |
| `-o`, `--output` | - | Output file path for standard Markdown |
Hyperlink Formats
| Mode | Description | Output Example |
|---|---|---|
| `inline` | Markdown format | `[text](URL)` |
| `inline_plain` | Plain text format | `text (URL)` |
| `footnote` | Footnote format | `[text][^1]` + `[^1]: URL` |
| `text_only` | Display text only | `text` |
| `both` | Inline + footnote | Both formats |
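To make the modes in the table concrete, here is a minimal Python sketch that renders one hyperlink in each mode. It is a reading of the table and of the sample output in this README, not the tool's implementation; the function `render_link` and its `note` parameter are hypothetical.

```python
def render_link(text, url, mode, note=1):
    """Return (cell_text, footnote_definition_or_None) for one hyperlink.

    Hypothetical helper: the output shapes follow the Hyperlink Formats
    table; `note` is the footnote number to use.
    """
    marker = f"[^{note}]"
    definition = f"{marker}: {url}"
    if mode == "inline":
        return f"[{text}]({url})", None
    if mode == "inline_plain":
        return f"{text} ({url})", None
    if mode == "footnote":
        return f"[{text}]{marker}", definition
    if mode == "text_only":
        return text, None
    if mode == "both":
        # Inline link followed by the footnote marker.
        return f"[{text}]({url}){marker}", definition
    raise ValueError(f"unknown hyperlink mode: {mode}")


for mode in ("inline", "inline_plain", "footnote", "text_only", "both"):
    print(mode, render_link("Supplier", "https://example.com", mode))
```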
Mermaid Flowcharts
| Option | Default | Description |
|---|---|---|
| `--mermaid-enabled` | false | Enable Mermaid conversion |
| `--mermaid-detect-mode` | shapes | Detection mode: shapes, column_headers, heuristic |
| `--mermaid-direction` | TD | Flowchart direction: TD, LR, BT, RL |
| `--mermaid-keep-source-table` | true | Output original table along with Mermaid |
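As a rough picture of what a Mermaid conversion produces, the following Python sketch turns a list of (source, target) pairs into a Mermaid flowchart definition with a configurable direction. The function `to_mermaid`, the `n1`, `n2` node ids, and the use of the `flowchart` keyword are assumptions made for illustration; the exact text excel2md emits may differ.

```python
def to_mermaid(edges, direction="TD"):
    """Render a list of (source, target) label pairs as Mermaid text.

    Hypothetical helper; labels containing double quotes are not handled.
    """
    if direction not in {"TD", "LR", "BT", "RL"}:
        raise ValueError(f"unsupported direction: {direction}")
    ids = {}  # label -> node id, assigned in first-seen order
    for source, target in edges:
        for label in (source, target):
            ids.setdefault(label, f"n{len(ids) + 1}")
    lines = [f"flowchart {direction}"]
    lines += [f'    {node_id}["{label}"]' for label, node_id in ids.items()]
    lines += [f"    {ids[s]} --> {ids[t]}" for s, t in edges]
    return "\n".join(lines)


print(to_mermaid([("Start", "Review"), ("Review", "Done")], direction="LR"))
```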
Table Processing
| Option | Default | Description |
|---|---|---|
| `--header-detection` | first_row | Treat first row as header |
| `--align-detection` | numbers_right | Right-align numeric columns |
| `--max-cells-per-table` | 200000 | Maximum cells per table |
| `--no-print-area-mode` | used_range | Behavior when print area not set |
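The combination of `first_row` header detection and `numbers_right` alignment can be illustrated with a short Python sketch. It treats the first row as the header and right-aligns a column when all of its non-empty body cells parse as numbers. The helpers `is_number` and `to_markdown_table` are hypothetical, and the tool's real detection rules may be more involved.

```python
def is_number(value):
    """True for ints/floats and for strings that parse as a float."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    try:
        float(str(value))
    except ValueError:
        return False
    return True


def to_markdown_table(rows):
    """Render rows as a Markdown table; the first row is the header."""
    header, *body = rows
    separators = []
    for col in range(len(header)):
        # Empty cells are ignored when deciding whether a column is numeric.
        values = [row[col] for row in body if row[col] not in ("", None)]
        numeric = bool(values) and all(is_number(v) for v in values)
        separators.append("---:" if numeric else "---")

    def line(cells):
        return "| " + " | ".join(str(c) for c in cells) + " |"

    return "\n".join([line(header), line(separators), *(line(r) for r in body)])


print(to_markdown_table([
    ["Item", "Quantity", "Notes"],
    ["Apple", 10, "Supplier"],
    ["Orange", 5, ""],
]))
```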
Output Examples
Standard Markdown Output
# Conversion Result: sample.xlsx
- Spec Version: 2.0
- Sheet Count: 2
- Sheet List: Sheet1, Summary
---
## Sheet1
### Table 1 (A1:C4)
| Item | Quantity | Notes |
| --- | ---: | --- |
| Apple | 10 | [Supplier](https://example.com)[^1] |
| Orange | 5 | |
[^1]: https://example.com
CSV Markdown Output
# CSV Output: sample.xlsx
## Summary
### File Information
- Original Excel filename: sample.xlsx
- Sheet count: 2
- Generated at: 2025-01-05 10:00:00
### About This File
This CSV markdown file is designed to help AI understand Excel content...
---
## Sheet1
```csv
Item,Quantity,Notes
Apple,10,Supplier
Orange,5,
```
---
## Validation Metadata
- **Generated at**: 2025-01-05 10:00:00
- **Original Excel file**: sample.xlsx
- **Validation status**: OK
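A downstream script can read the CSV blocks back out of a CSV markdown file. The Python sketch below assumes the layout shown in the sample above (a `## <sheet name>` heading followed by a fenced `csv` block); the function `read_csv_markdown` is hypothetical, and real files may contain details the sample does not show.

```python
import csv

FENCE = "`" * 3  # built at runtime so the example can sit inside a fenced block


def read_csv_markdown(text):
    """Map each sheet heading to the parsed rows of its csv block."""
    sheets, heading, block = {}, None, None
    for line in text.splitlines(keepends=True):
        if block is not None:  # currently inside a csv block
            if line.strip() == FENCE:
                sheets[heading] = list(csv.reader(block))
                block = None
            else:
                block.append(line)
        elif line.startswith("## "):
            heading = line[3:].strip()
        elif line.strip() == FENCE + "csv" and heading is not None:
            block = []
    return sheets


sample = "\n".join([
    "# CSV Output: sample.xlsx",
    "## Summary",
    "- Sheet count: 2",
    "---",
    "## Sheet1",
    FENCE + "csv",
    "Item,Quantity,Notes",
    "Apple,10,Supplier",
    "Orange,5,",
    FENCE,
    "---",
    "## Validation Metadata",
    "- **Validation status**: OK",
])
print(read_csv_markdown(sample))
```

Sections without a `csv` block, such as the summary and the validation metadata, are skipped.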
Image Extraction
Images in Excel files are automatically processed:
- Automatic Extraction: Images from each sheet are saved as external files
  - Filename format: `{sheet_name}_img_{number}.{extension}`
  - Example: `Sheet1_img_1.png`, `Sheet1_img_2.jpg`
- Save Location: Output to same directory as CSV markdown
  - Directory name: `{input_filename}_images/`
  - Example: `input.xlsx` → `input_images/` directory
  - Use `--csv-output-dir` option to change output location
- Markdown Links: Generates Markdown image links for cells with images
  - Format: `![alt text](image path)`
  - Uses cell value as alt text if available
  - Auto-generates alt text like `Image at A1` if cell is empty
- Supported Formats: PNG, JPEG, GIF
Example:
If a company logo image is at cell position (B2):
- Image file: saved as `input_images/Sheet1_img_1.png`
- CSV output: `![Company Logo](input_images/Sheet1_img_1.png)`
- Cell text "Company Logo" is used as alt text
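The naming and alt-text rules above can be summarised in a small Python sketch. The helpers `image_filename` and `image_link` are hypothetical, and the relative link path (`<input>_images/<file>`) is an assumption based on the save location described in this section.

```python
def image_filename(sheet_name, number, extension):
    """Build a name in the {sheet_name}_img_{number}.{extension} format."""
    return f"{sheet_name}_img_{number}.{extension}"


def image_link(input_stem, sheet_name, number, extension, cell_ref, cell_value=""):
    """Return a Markdown image link for an image anchored at `cell_ref`.

    The cell value is used as alt text when present; otherwise the alt
    text falls back to "Image at <cell>".
    """
    path = f"{input_stem}_images/{image_filename(sheet_name, number, extension)}"
    alt = cell_value.strip() or f"Image at {cell_ref}"
    return f"![{alt}]({path})"


print(image_link("input", "Sheet1", 1, "png", "B2", "Company Logo"))
print(image_link("input", "Sheet1", 2, "jpg", "A1"))
```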
Advanced Options
List all options:
uv run python v2.2.0/excel_to_md.py --help
Key advanced options:
- Cell merge policy
- Date/number format control
- Whitespace handling
- Markdown escape level
- Hidden row/column policy
- Locale-specific formatting
Directory Structure
excel2md/
├── v2.2.0/ # Latest version
│ ├── excel_to_md.py # Entry point
│ ├── excel2md/ # Main package
│ ├── tests/ # Test suite
│ ├── spec.md # Specification
│ └── spec_appendix.md # Specification appendix
├── v2.1.1/ # Previous version (frozen snapshot, not published to PyPI)
├── v2.1.0/ # Previous version (frozen snapshot)
├── v2.0.1/ # Previous version
├── v2.0/ # Previous version
├── v1.8/ # Legacy version
│ ├── excel_to_md.py # Main conversion program
│ ├── spec.md # Specification
│ └── tests/ # Test suite
├── v1.7/ # Legacy version
│ ├── excel_to_md.py # Main conversion program
│ ├── spec.md # Specification
│ └── tests/ # Test suite
├── docs/ # Documentation
├── pyproject.toml # Project metadata
├── LICENSE # MIT License
├── README.md / _ja.md # README (English / Japanese)
├── CONTRIBUTING.md / _ja.md # Contribution guide (English / Japanese)
├── SECURITY.md / _ja.md # Security policy (English / Japanese)
└── CHANGELOG.md / _ja.md # Version history (English / Japanese)
Security
For security concerns, please see SECURITY.md.
Key security notes:
- Only process Excel files from trusted sources
- Use `read_only=True` mode to prevent file modification
- Excel macros are not executed
- Sanitize Markdown output to prevent injection
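If converted Markdown is rendered somewhere that interprets HTML, a post-processing step along the following lines may help with the last point. This is a generic sketch of cell sanitization, not the escaping excel2md applies itself; the function name `sanitize_cell` is hypothetical.

```python
import html


def sanitize_cell(value):
    """Neutralise characters that could break a table row or inject HTML."""
    # Turn &, < and > into entities so raw HTML is shown, not interpreted.
    text = html.escape(str(value), quote=False)
    # Escape backslashes first, then pipes, so a pipe cannot end the cell.
    text = text.replace("\\", "\\\\").replace("|", "\\|")
    # Collapse line breaks, which would otherwise end the table row.
    return " ".join(text.splitlines())


print(sanitize_cell("<script>alert(1)</script>"))
print(sanitize_cell("a|b"))
```

How strict the escaping needs to be depends on the renderer; the converter's own escape settings are listed by `--help`.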
Contributing
Contributions are welcome! See CONTRIBUTING.md for details.
- Report bugs via GitHub Issues
- Submit pull requests for improvements
- Follow existing code style
- Add tests for new features
Changelog
See CHANGELOG.md for details.
Background
This tool was created during the development of IXV, an AI development support tool targeting Japanese development documents and specifications.
IXV addresses challenges in understanding, structuring, and utilizing Japanese documents in system development. This repository publicly shares a portion of that work.
License
MIT License - See LICENSE for details.
Contact
- Email: info@elvez.co.jp
- Company: Elvez, Inc.