Lightweight free open-source alternative to Aspose.Words — convert DOCX, DOC, RTF to Markdown, text, and PDF
Aspose.Words FOSS
A lightweight, open-source Python library for converting DOCX, DOC, RTF, TXT, and MD files to DOCX, Markdown, plain text, and PDF without requiring Microsoft Word.
A free, lightweight version of Aspose.Words for Python via .NET with a compatible API (Document, SaveFormat, SaveOptions).
Features
- DOCX Read/Write: Pure Python reader using only the standard library (zipfile, xml.etree)
- DOC Support: Word 97-2003 binary format reader via olefile
- RTF Support: Rich Text Format reader via OLE2 delegation
- Plain Text & Markdown Input: Read .txt and .md files
- Markdown Export: Rich formatting, including headings, bold/italic/strikethrough/underline, ordered and unordered lists (including nested), tables, block quotes, code blocks, and hyperlinks. Encoding and paragraph break sequence are configurable
- PDF Export: Generate PDF output via fpdf2. Applied PdfSaveOptions fields: compliance, image_compression, jpeg_quality, outline_options, export_document_structure, export_bookmarks_outline, zoom_behavior, zoom_factor, display_doc_title
- Plain Text Export: Extract document text content
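To illustrate what reading DOCX with only zipfile and xml.etree can look like, here is a minimal sketch. It is not the library's actual reader: the function names docx_paragraphs and make_minimal_docx are hypothetical, and the sketch only pulls paragraph text from word/document.xml, ignoring styles, tables, and other parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML main namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a DOCX file (path or file-like object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; each run stores text in <w:t>.
        paragraphs.append("".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t")))
    return paragraphs


def make_minimal_docx(texts):
    """Build a tiny in-memory archive with only word/document.xml (demo helper)."""
    body = "".join(f"<w:p><w:r><w:t>{escape(t)}</w:t></w:r></w:p>" for t in texts)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


sample = make_minimal_docx(["Hello", "World & more"])
extracted = docx_paragraphs(sample)
```

The demo archive is not a complete DOCX (it has no content types or relationships parts), so it is only suitable for exercising the parsing step.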
Installation
From PyPI:
pip install aspose-words-foss
Nightly (latest from GitHub):
pip install git+https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python.git
Quick Start
Convert a document to Markdown
import aspose.words_foss as aw
doc = aw.Document("input.docx") # or .doc, .rtf, .txt, .md
doc.save("output.md", aw.SaveFormat.MARKDOWN)
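The same two calls can be wrapped in a loop to convert a whole folder. The sketch below is one possible way to do it, not part of the library: convert_to_markdown, convert_folder, and the INPUT_EXTENSIONS set are hypothetical names, and the converter is passed in as a parameter so the loop can be exercised without the package installed.

```python
from pathlib import Path

# Non-Markdown input extensions from the Quick Start comment above.
INPUT_EXTENSIONS = {".docx", ".doc", ".rtf", ".txt"}


def convert_to_markdown(src, dst):
    """Convert one file using the Document/save calls shown above."""
    # Imported lazily so convert_folder can be used with another converter.
    import aspose.words_foss as aw

    aw.Document(str(src)).save(str(dst), aw.SaveFormat.MARKDOWN)


def convert_folder(src_dir, dst_dir, convert=convert_to_markdown):
    """Convert every supported file in src_dir and return the output paths."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for src in sorted(Path(src_dir).iterdir()):
        if src.is_file() and src.suffix.lower() in INPUT_EXTENSIONS:
            dst = dst_dir / (src.stem + ".md")
            convert(src, dst)
            outputs.append(dst)
    return outputs
```

Calling convert_folder("docs", "markdown") would write one .md file per supported input. Files that share a stem (for example report.doc and report.docx) map to the same output name in this sketch, so the later one overwrites the earlier.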
Export to PDF
import aspose.words_foss as aw
doc = aw.Document("input.docx")
doc.save("output.pdf", aw.SaveFormat.PDF)
Export to DOCX
import aspose.words_foss as aw
doc = aw.Document("input.docx") # or .doc, .rtf
doc.save("output.docx", aw.SaveFormat.DOCX)
Extract plain text
import aspose.words_foss as aw
doc = aw.Document("input.docx")
text = doc.get_text()
Save with options
import aspose.words_foss as aw
from aspose.words_foss.saving import (
MarkdownSaveOptions,
OoxmlSaveOptions,
PdfSaveOptions,
CompressionLevel,
)
doc = aw.Document("input.docx")
# Markdown: underline, encoding, paragraph break
md_opts = MarkdownSaveOptions()
md_opts.export_underline_formatting = True
md_opts.encoding = "utf-8-sig" # write a UTF-8 BOM
md_opts.paragraph_break = "\r\n" # CRLF between paragraphs
doc.save("output.md", md_opts)
# DOCX: compression level
ooxml_opts = OoxmlSaveOptions()
ooxml_opts.compression_level = CompressionLevel.MAXIMUM
doc.save("output.docx", ooxml_opts)
# PDF: default options
pdf_opts = PdfSaveOptions()
doc.save("output.pdf", pdf_opts)
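The encoding and paragraph_break values used above are ordinary Python codec names and strings. The following standalone sketch shows what those two values mean at the byte level. It assumes the paragraph break acts as a separator between paragraphs, as the comment above suggests, and it does not call the library.

```python
import codecs

paragraphs = ["First paragraph", "Second paragraph"]

# Join paragraphs with CRLF, mirroring paragraph_break = "\r\n".
text = "\r\n".join(paragraphs)

# "utf-8-sig" prepends the UTF-8 byte order mark when encoding.
data = text.encode("utf-8-sig")
has_bom = data.startswith(codecs.BOM_UTF8)

# Decoding with "utf-8-sig" strips the BOM again;
# plain "utf-8" keeps it as a leading U+FEFF character.
round_trip = data.decode("utf-8-sig")
plain_decode = data.decode("utf-8")
```

A BOM helps some Windows tools detect UTF-8, but readers that decode with plain "utf-8" will see the extra U+FEFF character at the start of the file.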
Requirements
- Python 3.10 or higher
- olefile >= 0.46
- fpdf2 >= 2.7.0
- pydantic >= 2.0.0
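A quick way to check an environment against this list is to compare installed versions with importlib.metadata. This is a rough sketch with hypothetical helper names (parse_version, is_older, check_requirements); it only understands plain dotted numeric versions, so pip's own resolver remains the authoritative check.

```python
import sys
from importlib import metadata

# Minimum versions copied from the Requirements list above.
MINIMUMS = {"olefile": "0.46", "fpdf2": "2.7.0", "pydantic": "2.0.0"}


def parse_version(text):
    """Turn '2.7.9' into (2, 7, 9); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older(found, minimum):
    """Compare version tuples after padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(found) < pad(minimum)


def check_requirements(minimums=MINIMUMS, installed_version=metadata.version):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or higher is required")
    for name, minimum in minimums.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if is_older(parse_version(found), parse_version(minimum)):
            problems.append(f"{name} {found} is older than the required {minimum}")
    return problems
```

Running check_requirements() before importing the package gives a readable list of anything missing or outdated.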
API Examples
Runnable examples demonstrating the aspose.words_foss API are in the ApiExamples folder.
Files
| File | What it shows |
|---|---|
| convert_document.py | Every input format (DOCX, DOC, RTF, TXT, MD) to every output format (Markdown, PDF, TXT) |
| working_with_markdown_save_options.py | MarkdownSaveOptions: export_underline_formatting, encoding, paragraph_break |
| working_with_ooxml_save_options.py | OoxmlSaveOptions for DOCX export: pretty_format, compression_level |
| working_with_pdf_save_options.py | PDF export from all input formats |
| working_with_txt_save_options.py | Plain-text export and get_text() |
| working_with_images.py | Image-containing documents to all output formats |
Running
# Individual scripts
python ApiExamples/convert_document.py
# All via pytest
python -m pytest ApiExamples/ -v --rootdir=ApiExamples -c ApiExamples/pytest.ini
Input / Output
- Input: tests/data/input/ (shared test fixtures)
- Output: ApiExamples/output/ (git-ignored)
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Issues: GitHub Issues