Safe local document-to-markdown preprocessing for OpenClaw, Claude Code, Codex, Hermes, and other agents.

agent-markitdown

Safe local document-to-markdown preprocessing for agents.

Built for OpenClaw first, but intentionally usable from Claude Code, Codex, Hermes Agent, and anything else that can run a local CLI or Python package.

What it is

agent-markitdown wraps Microsoft's excellent markitdown with an agent-oriented safety and workflow layer:

- local files only
- convert_local() only
- plugins off by default
- extension allowlist
- size guardrail
- deterministic JSON output
- extraction warnings when markdown may be incomplete
- review-pack generation for LLM handoff
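
The first few guardrails in the list above can be pictured with a small pre-flight check. This is only a sketch under stated assumptions, not the package's actual code: the function name `check_local_file` and the 25 MiB default are hypothetical, and the extension set is copied from the supported-extensions list in this README.

```python
from pathlib import Path

# Extensions copied from the "Supported extensions" list in this README.
ALLOWED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".xls", ".html", ".htm",
    ".csv", ".tsv", ".json", ".xml", ".txt", ".md", ".rtf", ".epub",
    ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff", ".webp",
}

# Hypothetical default: the package's real size cap is not stated here.
MAX_BYTES = 25 * 1024 * 1024


def check_local_file(path, max_bytes=MAX_BYTES):
    """Return a list of reasons to refuse the file; an empty list means it passes."""
    # Anything that looks like a URL is rejected before touching the filesystem.
    if "://" in str(path):
        return ["remote URLs are not accepted"]

    p = Path(path)
    if not p.is_file():
        return ["not a regular local file"]

    problems = []
    # Extension allowlist, compared case-insensitively.
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"extension {p.suffix!r} is not on the allowlist")
    # Size guardrail.
    if p.stat().st_size > max_bytes:
        problems.append(f"file exceeds the {max_bytes}-byte cap")
    return problems
```

A host would call a check like this before handing the path to any converter and skip the file if the returned list is non-empty.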

Why this exists

Raw file uploads are awkward for agent workflows.

For supported document types, agents usually work better when they receive clean markdown instead of a binary attachment or a heavyweight vision/PDF pass.

That means:

- lower context overhead
- easier quoting and summarization
- better portability across agent runtimes
- safer, narrower preprocessing than raw markitdown convert()

What it is not

This package does not magically patch every agent runtime on earth.

It gives you a safe preprocessing layer plus integration assets. Each host agent still needs a tiny adapter or instruction layer telling it to run agent-markitdown before review.

OpenClaw gets a ready-made skill. Other agents get drop-in snippets.

Status

- GitHub repo: live
- CI/release workflows: included
- PyPI publish path: ready once a token or trusted publisher is configured

Installation

uv venv .venv
uv pip install --python .venv/bin/python .
# or with test/dev dependencies
uv pip install --python .venv/bin/python '.[dev]'

Or from PyPI:

pip install agent-markitdown

CLI

Convert one file to stdout

agent-markitdown convert ./report.pdf

Convert and emit JSON

agent-markitdown convert ./report.docx --json

JSON output includes a warnings array. It is empty for ordinary text extraction, and it calls out cases where the markdown should not be treated as complete, such as very low extracted text or image inputs that may need OCR/vision review.
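
A host-side script could use the warnings array to decide whether the markdown is safe to treat as complete. The sketch below runs the documented `convert ... --json` command, but it assumes the JSON is written to stdout as a single top-level object with a `warnings` key. That shape is an assumption, as are the helper names `convert_to_json`, `extraction_warnings`, and `is_complete`.

```python
import json
import subprocess


def convert_to_json(path):
    """Run the documented CLI command and parse its stdout as JSON."""
    proc = subprocess.run(
        ["agent-markitdown", "convert", str(path), "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return json.loads(proc.stdout)


def extraction_warnings(payload):
    """Return the warnings list (assumed: top-level object with a "warnings" key)."""
    return list(payload.get("warnings", []))


def is_complete(payload):
    """True when the converter reported no extraction warnings."""
    return not extraction_warnings(payload)
```

If `is_complete` returns False, the host can fall back to an OCR or vision pass instead of trusting the extracted text.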

Write sidecar markdown files

agent-markitdown convert ./report.pdf ./notes.docx --sidecar

Build one review bundle for an agent

agent-markitdown review-pack ./report.pdf ./notes.docx -o review-pack.md
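
One way a host adapter might hand the bundle to an agent is sketched below. It shells out to the `review-pack` command shown above and then wraps the resulting markdown in a prompt. The helper names `build_review_pack` and `make_prompt` and the prompt wording are hypothetical, and nothing is assumed about the internal layout of the review pack.

```python
import subprocess
from pathlib import Path


def build_review_pack(inputs, output="review-pack.md"):
    """Run the documented review-pack command and return the bundle text."""
    subprocess.run(
        ["agent-markitdown", "review-pack", *map(str, inputs), "-o", str(output)],
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return Path(output).read_text(encoding="utf-8")


def make_prompt(review_pack, question):
    """Wrap the bundle and a task into one prompt string (wording is illustrative)."""
    return (
        "Review the following documents, converted to markdown.\n\n"
        f"{review_pack.strip()}\n\n"
        f"Task: {question.strip()}\n"
    )
```

Typical use would be `make_prompt(build_review_pack(["./report.pdf", "./notes.docx"]), "Summarize the key risks.")`, with the result passed to whichever agent CLI the host drives.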

Health check

agent-markitdown doctor

Supported extensions

- .pdf
- .docx
- .pptx
- .xlsx
- .xls
- .html, .htm
- .csv, .tsv
- .json, .xml
- .txt, .md, .rtf
- .epub
- .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .webp

OpenClaw

See integrations/openclaw/SKILL.md.

That skill tells OpenClaw to preprocess supported uploaded documents into markdown before deeper review/summarization work.

Install the OpenClaw skill into a workspace:

./scripts/install-openclaw-skill.sh

Other agents

- Claude Code: integrations/claude-code/AGENTS.md
- Codex: integrations/codex/AGENTS.md
- Hermes Agent: integrations/hermes-agent/SKILL.md

For copyable host-side patterns, see:

- examples/review-pack-consumers/ for a generic review-pack handoff
- examples/auto-preprocess-adapters/ for profile-specific prompt adapters that can sit in front of agent CLIs

Security stance

This package intentionally avoids the broadest markitdown surfaces.

- no remote URLs
- no convert()
- no plugins unless explicitly enabled
- no ZIP traversal support
- explicit extension allowlist
- configurable size cap
- warnings for low-text extraction and image inputs that may need OCR/vision

If you're handling untrusted uploads in a server context, keep validating paths and storing uploads in a controlled temp area. This package narrows the blast radius; it does not replace sane host hygiene.
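
For the host-side path validation mentioned above, a minimal sketch is to resolve every upload name against the controlled temp directory and refuse anything that lands outside it. The function name `resolve_inside` is hypothetical and this is not part of the package. It uses only `pathlib` from the standard library.

```python
from pathlib import Path


def resolve_inside(upload_dir, candidate):
    """Resolve candidate under upload_dir, or raise ValueError if it escapes."""
    base = Path(upload_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below sees the real destination rather than the name the client sent.
    target = (base / candidate).resolve()
    if base not in target.parents:
        raise ValueError(f"{candidate!r} resolves outside the upload directory")
    return target
```

Only the returned path would then be passed to `agent-markitdown convert`. Names such as `../outside.pdf` or an absolute path raise before any conversion is attempted.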

Release flow

- CI runs on push/PR
- release workflow runs on v* tags
- tagged releases build wheel + sdist and attach them to a GitHub release
- PyPI publish is attempted automatically when either:
  - PYPI_API_TOKEN repo secret exists, or
  - PYPI_TRUSTED_PUBLISHING=true repo variable is set and PyPI trusted publishing is configured

See docs/publishing.md and docs/release-checklist.md.

Attribution

This project depends on and is inspired by Microsoft's markitdown, which is MIT licensed.

This version

0.1.1

Jun 20, 2026

Download files

Source Distribution

agent_markitdown-0.1.1.tar.gz (102.1 kB)

Uploaded Jun 20, 2026 Source

Built Distribution

agent_markitdown-0.1.1-py3-none-any.whl (9.2 kB)

Uploaded Jun 20, 2026 Python 3

File details

Details for the file agent_markitdown-0.1.1.tar.gz.

File metadata

Download URL: agent_markitdown-0.1.1.tar.gz
Upload date: Jun 20, 2026
Size: 102.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agent_markitdown-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`29cfbf8f8ed7b6997860e62e8d42c612bcb180086d81a50eacbd2261c2d4ec72`
MD5	`3b1ee176a86ad497d56a61eebc4bfd35`
BLAKE2b-256	`2c60cf54d90c26765f36d5f69654842a0f3eaed671f6a15c01728a498a20f8c8`
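
A downloaded sdist can be checked against the SHA256 digest listed above with the standard library alone. The helper names `sha256_of` and `matches_published_hash` below are hypothetical. The expected digest is the one shown in the table for agent_markitdown-0.1.1.tar.gz.

```python
import hashlib

# SHA256 digest listed above for agent_markitdown-0.1.1.tar.gz.
EXPECTED_SHA256 = "29cfbf8f8ed7b6997860e62e8d42c612bcb180086d81a50eacbd2261c2d4ec72"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_published_hash("agent_markitdown-0.1.1.tar.gz")` on the downloaded file should return True only if the archive is byte-for-byte the one that was uploaded.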

File details

Details for the file agent_markitdown-0.1.1-py3-none-any.whl.

File metadata

Download URL: agent_markitdown-0.1.1-py3-none-any.whl
Upload date: Jun 20, 2026
Size: 9.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agent_markitdown-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6f6074f400472f55b5b1e411fc4e5c9050b5da63e1b0eb4679d8032c1138b54c`
MD5	`f7e269adcfa4845adcdac924d2206e01`
BLAKE2b-256	`4a44210a23fe0c7a1e13cac8c1f9cf2658b54c9b85044c113f332c500e007e50`
