Convert .docx and .xlsx to Markdown with images extracted (optional .doc -> .docx via pywin32 on Windows)
convert-documents-skill (npm wrapper)
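This is a document conversion tool written in Python, with PyPI as its publishing target. It converts .docx files to Markdown, and converts .xlsx files to Markdown that includes every sheet. The output consists of a folder named after the source file, a .md file, and an _images/ folder for extracted images.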
Usage (Python recommended)
- Install the dependencies in a virtual environment:

      python -m venv .venv
      .venv\Scripts\activate  # Windows
      pip install -r requirements.txt

- Run the conversion:

      python convert_documents_skill.py path/to/file.docx

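Notes:
- Supports .docx (the default) and .xlsx (Sheet -> Markdown).
- Also supported: .doc (requires Windows + MS Word + pywin32). The script tries to convert the .doc to .docx through COM and then processes the result; if pywin32 or MS Word is not installed, it prints a friendly error message.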
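The .doc fallback described above could look roughly like the following. This is a minimal sketch, not the script's actual code: the function name `convert_doc_to_docx`, the exception class `DocConversionError`, and the error wording are hypothetical. It assumes pywin32's `win32com.client.Dispatch` and Word's `SaveAs2` method with the `wdFormatXMLDocument` format constant (12).

```python
import sys
from pathlib import Path

# Value of Word's wdFormatXMLDocument constant, which selects the .docx format.
WD_FORMAT_XML_DOCUMENT = 12


class DocConversionError(RuntimeError):
    """Hypothetical error type carrying a user-friendly message."""


def convert_doc_to_docx(doc_path):
    """Convert a legacy .doc file to .docx through Word's COM interface."""
    if sys.platform != "win32":
        raise DocConversionError(
            ".doc conversion requires Windows with Microsoft Word installed."
        )
    try:
        import win32com.client  # provided by the pywin32 package
    except ImportError as exc:
        raise DocConversionError(
            "pywin32 is not installed. Run: pip install pywin32"
        ) from exc

    src = Path(doc_path).resolve()  # COM needs absolute paths
    dst = src.with_suffix(".docx")
    try:
        word = win32com.client.Dispatch("Word.Application")
    except Exception as exc:
        raise DocConversionError(
            "Could not start Microsoft Word. Is it installed?"
        ) from exc
    try:
        word.Visible = False
        doc = word.Documents.Open(str(src))
        try:
            # Positional arguments: FileName, FileFormat.
            doc.SaveAs2(str(dst), WD_FORMAT_XML_DOCUMENT)
        finally:
            doc.Close(False)  # close without saving changes to the .doc
    finally:
        word.Quit()
    return dst
```

Importing `win32com` inside the function keeps the module importable on machines without pywin32, so .docx and .xlsx inputs are unaffected.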
Output: a folder with the same name as the source file is created in the source file's directory, containing the .md file and _images/.
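To make the output layout concrete, here is a small sketch that computes the paths for a given input. The helper name `plan_output_paths` is hypothetical, and the assumption that the .md file takes the source file's stem is mine; the README only says that the folder shares the source file's name.

```python
from pathlib import Path


def plan_output_paths(source):
    """Return the output folder, Markdown file, and image folder for a source file."""
    src = Path(source)
    out_dir = src.parent / src.stem  # folder named after the source, next to it
    return {
        "dir": out_dir,
        # Assumption: the Markdown file reuses the source file's stem.
        "markdown": out_dir / f"{src.stem}.md",
        "images": out_dir / "_images",
    }


paths = plan_output_paths("docs/report.docx")
print(paths["markdown"])
print(paths["images"])
```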
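Caveats
- To convert .doc (the legacy format), install Microsoft Word on Windows and run pip install pywin32. The script prints a clear error message if the conversion fails.
- If your document is complex, the output produced by mammoth may need manual proofreading.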
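One way to decide which documents need proofreading is to look at the messages mammoth reports during conversion. The sketch below is an illustration under assumptions, not the script's actual code: `convert_docx` and `image_filename` are hypothetical names, and the image naming scheme is invented. It uses mammoth's `convert_to_markdown` with a custom image handler built by `mammoth.images.img_element`.

```python
from pathlib import Path

# Hypothetical mapping from image MIME types to file extensions.
_EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/gif": ".gif"}


def image_filename(index, content_type):
    """Build a file name such as image1.png for the index-th extracted image."""
    return f"image{index}{_EXTENSIONS.get(content_type, '.bin')}"


def convert_docx(docx_path, out_dir):
    """Write Markdown plus extracted images; return the .md path and warnings."""
    import mammoth  # third-party dependency, imported lazily

    out_dir = Path(out_dir)
    images_dir = out_dir / "_images"
    images_dir.mkdir(parents=True, exist_ok=True)
    counter = 0

    def save_image(image):
        # Save each embedded image to _images/ and link to it relatively.
        nonlocal counter
        counter += 1
        name = image_filename(counter, image.content_type)
        with image.open() as stream:
            (images_dir / name).write_bytes(stream.read())
        return {"src": f"_images/{name}"}

    with open(docx_path, "rb") as docx_file:
        result = mammoth.convert_to_markdown(
            docx_file,
            convert_image=mammoth.images.img_element(save_image),
        )

    md_path = out_dir / f"{Path(docx_path).stem}.md"
    md_path.write_text(result.value, encoding="utf-8")
    # Each message has a type (for example "warning") and a description.
    return md_path, [f"{m.type}: {m.message}" for m in result.messages]
```

A non-empty list of messages is a reasonable hint that the generated Markdown should be reviewed by hand.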
Publishing
To publish to npm (wrapper package):
- Update package.json repository, author, and version fields.
- Run npm publish --access public (or with your scoped registry).
To publish to PyPI:
- Ensure pyproject.toml and setup.cfg are updated with your metadata.
- Build distributions: python -m build
- Upload: python -m twine upload dist/*
Notes for OpenCode skill usage
- The npm package is a convenience wrapper that calls the Python script. For an OpenCode skill, package the Python code as a pip package and publish on PyPI, or publish the npm wrapper if the platform prefers npm.
- The script requires external binaries for .doc/.xls conversion on Windows (MS Office). For server-side usage, prefer .docx/.xlsx inputs.
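For server-side use, inputs can be checked before conversion so that Office-dependent formats are rejected early. This is a sketch under assumptions: `classify_input` and its return labels are hypothetical, and the extension sets are taken from the formats named in this README.

```python
import sys
from pathlib import Path

PORTABLE = {".docx", ".xlsx"}   # handled without external binaries
OFFICE_ONLY = {".doc", ".xls"}  # need MS Office on Windows


def classify_input(path, platform=sys.platform):
    """Return 'portable' or 'needs-office', or raise ValueError for unusable input."""
    ext = Path(path).suffix.lower()
    if ext in PORTABLE:
        return "portable"
    if ext in OFFICE_ONLY:
        if platform == "win32":
            return "needs-office"
        raise ValueError(
            f"{ext} files require MS Office on Windows; convert to "
            f"{ext}x first or upload a .docx/.xlsx file."
        )
    raise ValueError(f"Unsupported file type: {ext or '(none)'}")
```

The `platform` parameter defaults to the running interpreter's platform and exists only to make the check easy to exercise in tests.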
Download files
File details
Details for the file convert_documents_skill-0.1.0.tar.gz.
File metadata
- Download URL: convert_documents_skill-0.1.0.tar.gz
- Upload date:
- Size: 8.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 473274705bfc564d0fdbef3df4837efddee7e77a95df1b160a3465a730cdbefc |
| MD5 | 51a53ad8f4a83df73246321f163fdece |
| BLAKE2b-256 | 677eb4e5712ea084ad9a8e132e5776aaeccd8c5e7b1cafc2eec3271a482b532e |
File details
Details for the file convert_documents_skill-0.1.0-py3-none-any.whl.
File metadata
- Download URL: convert_documents_skill-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0a5147fe62d4753d040a58b68dfc600664f2294cdc992b0e0a76125d9b4ad9b9 |
| MD5 | 4d1e3e3b43a5de974f37a3aa9498c1ed |
| BLAKE2b-256 | 0ea32a57f4a045a7e782a4a9cf7ab986462812a63b2d2fbbe8f6091706427bfa |