markpipe
Convert PDF, DOCX, and XLSX files to clean Markdown — one command, no API keys.
Built for developers and teams who feed documents into LLMs or RAG pipelines.
Installation
pip install markpipe
Or from source:
git clone https://github.com/keremnuman/markdown-pipeline.git
cd markdown-pipeline
pip install -e .
CLI
# Single file
doc2md report.pdf
# Entire folder
doc2md ./documents --output ./output_md
# Parallel workers
doc2md ./documents --workers 8
# With config file
doc2md --config config.yaml
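For scripted runs, the same invocations can be assembled from Python. The following is a minimal sketch, assuming the `doc2md` executable is on `PATH` after installation. `build_doc2md_command` and `run_doc2md` are hypothetical helpers, not part of markpipe, and they use only the flags shown above. Whether every combination of flags is accepted together is also an assumption.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_doc2md_command(
    source: Optional[Path] = None,
    output: Optional[Path] = None,
    workers: Optional[int] = None,
    config: Optional[Path] = None,
) -> List[str]:
    """Assemble a doc2md argument list from the flags shown in the CLI examples.

    Hypothetical helper: it only mirrors --output, --workers and --config.
    """
    cmd = ["doc2md"]
    if source is not None:
        cmd.append(str(source))
    if output is not None:
        cmd += ["--output", str(output)]
    if workers is not None:
        if workers < 1:
            raise ValueError("workers must be at least 1")
        cmd += ["--workers", str(workers)]
    if config is not None:
        cmd += ["--config", str(config)]
    return cmd


def run_doc2md(cmd: List[str]) -> int:
    """Run the command and return its exit code; fail early if doc2md is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("doc2md not found on PATH; install markpipe first")
    return subprocess.run(cmd, check=False).returncode
```

Building the argument list separately from running it keeps the command easy to log or inspect before anything is converted.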
Python API
from pathlib import Path
from doc2md import MicrosoftMarkItDownConverter, DocumentPipeline
converter = MicrosoftMarkItDownConverter()
pipeline = DocumentPipeline(converter=converter, output_dir=Path("./output"))
pipeline.process_single(Path("report.pdf")) # single file
pipeline.process_batch(Path("./documents")) # batch
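Before running a batch, it can help to preview which files in a folder match the three formats markpipe advertises. The following is a minimal sketch using only the standard library. `find_convertible` and `SUPPORTED_SUFFIXES` are hypothetical names, and whether `process_batch` itself recurses into subfolders or filters by extension is not stated here, so treat this as an assumed pre-check rather than a description of the pipeline's behavior.

```python
from pathlib import Path
from typing import List

# Assumption: the three formats named in the project description.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".xlsx"}


def find_convertible(folder: Path, recursive: bool = False) -> List[Path]:
    """Return PDF, DOCX and XLSX files under `folder`, sorted by path.

    Hypothetical helper, not part of markpipe. Suffix matching is
    case-insensitive, so "REPORT.PDF" is included.
    """
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    pattern = "**/*" if recursive else "*"
    return sorted(
        p
        for p in folder.glob(pattern)
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

An empty result is a cheap signal that the path is wrong before any conversion work starts.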
Contributing
git clone https://github.com/keremnuman/markdown-pipeline.git
cd markdown-pipeline
pip install -e ".[dev]"
pytest tests/ -v
License
MIT
File details
Details for the file markpipe-0.1.2.tar.gz.
File metadata
- Download URL: markpipe-0.1.2.tar.gz
- Upload date:
- Size: 6.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fa6f78884e9f2d6b8f0f9faf7b3f1e26343a460c32dfa144e57b8ef782a6fde2 |
| MD5 | b2dff6caf4abc167c4de5b5747885948 |
| BLAKE2b-256 | 0412f1c99b03c2c779975d72ca687ef3d97dd799e4ae5af07f945524a4225702 |
File details
Details for the file markpipe-0.1.2-py3-none-any.whl.
File metadata
- Download URL: markpipe-0.1.2-py3-none-any.whl
- Upload date:
- Size: 6.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a81c977c8abfef6351461993a6ceac1d0e9fbec6266bf4797b832de49b7721aa |
| MD5 | 76d55857d32acd40c282142abc342196 |
| BLAKE2b-256 | eff153690ca89233f86131a3e72b2f8d74d041c13b2b9b964accdc80852ed3f7 |