Convert .docx and .xlsx to Markdown with images extracted (optional .doc -> .docx via pywin32 on Windows)

convert-documents-skill (npm wrapper)

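This is a document conversion tool implemented in Python (PyPI is the publishing target). It converts .docx files to Markdown, and converts .xlsx files to Markdown that includes every sheet. The output consists of a folder named after the source file, a .md file, and an _images/ folder for extracted images.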

Usage (Python recommended)

  1. Install the dependencies in a virtual environment:

    python -m venv .venv
    .venv\Scripts\activate  # Windows
    pip install -r requirements.txt

  2. Run the conversion:

    python convert_documents_skill.py path/to/file.docx

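Details:

  • Supports .docx (the default) and .xlsx (Sheet -> Markdown).
  • Also supported: .doc (requires Windows + MS Word + pywin32). The script tries to convert .doc to .docx through COM before processing it; if pywin32 or MS Word is not installed, it prints a friendly error message.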
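
As an illustration of the Sheet -> Markdown step, here is a minimal sketch of rendering sheet rows as Markdown pipe tables. It is not the script's actual code: the function names `rows_to_markdown_table` and `sheets_to_markdown` are hypothetical, the first row is assumed to be the header, and the rows are assumed to have been read already (for example with openpyxl's `ws.iter_rows(values_only=True)`).

```python
def rows_to_markdown_table(rows):
    """Render a list of rows (first row treated as the header) as a Markdown pipe table."""
    if not rows:
        return ""
    width = max(len(r) for r in rows)

    def fmt(row):
        # Empty cells become empty strings; short rows are padded to the table width.
        cells = ["" if c is None else str(c) for c in row]
        cells += [""] * (width - len(cells))
        # Escape pipes and flatten newlines so a cell cannot break the table.
        cells = [c.replace("|", "\\|").replace("\n", " ") for c in cells]
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(r) for r in body]
    return "\n".join(lines)


def sheets_to_markdown(sheets):
    """sheets: dict mapping sheet name -> list of rows. One heading per sheet (assumed layout)."""
    parts = [f"## {name}\n\n{rows_to_markdown_table(rows)}" for name, rows in sheets.items()]
    return "\n\n".join(parts) + "\n"
```

Calling `sheets_to_markdown({"Sheet1": [["name", "qty"], ["apple", 3]]})` would yield one `## Sheet1` heading followed by a two-column table.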

Output: a folder named after the source file is created in the source file's directory; it contains the .md file and _images/.
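
The output layout described above can be sketched with pathlib. This is only an assumption about how the paths are derived: `output_paths` is a hypothetical helper, and naming the .md file after the source file's stem is a guess, not something the script is confirmed to do.

```python
from pathlib import Path


def output_paths(source):
    """Compute the assumed output locations for a source document."""
    src = Path(source)
    out_dir = src.parent / src.stem  # same-named folder next to the source file
    return {
        "dir": out_dir,
        "markdown": out_dir / f"{src.stem}.md",  # assumed file name
        "images": out_dir / "_images",           # extracted images
    }
```

For `path/to/file.docx` this gives `path/to/file/`, `path/to/file/file.md`, and `path/to/file/_images/`.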

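Caveats

  • To convert .doc (the legacy format), install Microsoft Word on Windows and run pip install pywin32. The script prints a clear error if the conversion fails.
  • If your document is complex, the mammoth conversion result may need manual proofreading.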
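
For readers curious about the .doc path, the following is a rough sketch of a .doc -> .docx conversion through Word's COM interface with pywin32. It is not taken from the script: `convert_doc_to_docx` is a hypothetical name, the error messages are illustrative, and writing the .docx next to the source file is an assumption.

```python
import sys
from pathlib import Path

WD_FORMAT_XML_DOCUMENT = 12  # Word's wdFormatXMLDocument constant (.docx)


def convert_doc_to_docx(doc_path):
    """Convert a .doc file to .docx via MS Word; raise RuntimeError with a readable message on failure."""
    if sys.platform != "win32":
        raise RuntimeError("Converting .doc requires Windows with Microsoft Word installed.")
    try:
        import win32com.client  # provided by pywin32
    except ImportError as exc:
        raise RuntimeError("pywin32 is not installed; run: pip install pywin32") from exc

    src = Path(doc_path).resolve()  # COM needs an absolute path
    dst = src.with_suffix(".docx")
    word = None
    try:
        word = win32com.client.Dispatch("Word.Application")
        word.Visible = False
        doc = word.Documents.Open(str(src))
        try:
            doc.SaveAs2(str(dst), FileFormat=WD_FORMAT_XML_DOCUMENT)
        finally:
            doc.Close(False)  # close without saving changes to the original
    except Exception as exc:
        raise RuntimeError(f"Word could not convert {src.name}: {exc}") from exc
    finally:
        if word is not None:
            word.Quit()
    return dst
```

Checking the platform and the import up front keeps the failure message specific, which matches the friendly-error behavior described above.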

Publishing

To publish to npm (wrapper package):

  1. Update package.json repository, author, and version fields.
  2. Run npm publish --access public (or with your scoped registry).

To publish to PyPI:

  1. Ensure pyproject.toml and setup.cfg are updated with your metadata.
  2. Build distributions: python -m build
  3. Upload: python -m twine upload dist/*

Notes for OpenCode skill usage

  • The npm package is a convenience wrapper that calls the Python script. For an OpenCode skill, package the Python code as a pip package and publish on PyPI, or publish the npm wrapper if the platform prefers npm.
  • The script requires external binaries for .doc/.xls conversion on Windows (MS Office). For server-side usage, prefer .docx/.xlsx inputs.
