Lightweight free open-source alternative to Aspose.Words — convert DOCX, DOC, RTF to Markdown, text, and PDF

Aspose.Words FOSS

A lightweight, open-source Python library for converting DOCX, DOC, RTF, TXT, and MD files to DOCX, Markdown, plain text, and PDF without requiring Microsoft Word.

It is a free, lightweight version of Aspose.Words for Python via .NET with a compatible API (Document, SaveFormat, SaveOptions).

Python License: MIT

Features

  • DOCX Read/Write: Pure Python reader using only the standard library (zipfile, xml.etree)
  • DOC Support: Word 97-2003 binary format reader via olefile
  • RTF Support: Rich Text Format reader via OLE2 delegation
  • Plain Text & Markdown Input: Read .txt and .md files
  • Markdown Export: Rich formatting — headings, bold/italic/strikethrough/underline, ordered and unordered lists (including nested), tables, block quotes, code blocks, and hyperlinks. Encoding and paragraph break sequence are configurable
  • PDF Export: Generate PDF output via fpdf2. Applied PdfSaveOptions fields: compliance, image_compression, jpeg_quality, outline_options, export_document_structure, export_bookmarks_outline, zoom_behavior, zoom_factor, display_doc_title
  • Plain Text Export: Extract document text content
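
To illustrate what reading DOCX with only zipfile and xml.etree can look like, here is a minimal sketch. It is not the library's actual reader: the helper names build_minimal_docx and extract_paragraphs are hypothetical, and the in-memory package contains only word/document.xml, so it is enough for this demonstration but is not a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Main WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_minimal_docx(paragraphs):
    """Build a tiny in-memory package holding only word/document.xml (demo fixture)."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    document_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("word/document.xml", document_xml)
    return buffer.getvalue()


def extract_paragraphs(docx_bytes):
    """Return the text of each w:p element, concatenating its w:t runs."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


if __name__ == "__main__":
    data = build_minimal_docx(["Hello", "A & B"])
    print(extract_paragraphs(data))
```

A real reader also has to handle styles, numbering, tables, and relationships; this sketch covers only paragraph text.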

Installation

From PyPI:

pip install aspose-words-foss

Nightly (latest from GitHub):

pip install git+https://github.com/aspose-words-foss/Aspose.Words-FOSS-for-Python.git

Quick Start

Convert a document to Markdown

import aspose.words_foss as aw

doc = aw.Document("input.docx")  # or .doc, .rtf, .txt, .md
doc.save("output.md", aw.SaveFormat.MARKDOWN)
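
The same two calls can be applied to a whole folder. The sketch below is one possible way to do that, not something shipped with the package: plan_conversions and convert_all are hypothetical helpers, and only aw.Document(...) and doc.save(..., aw.SaveFormat.MARKDOWN) come from this README. The package is imported lazily so that the planning step can be tried without it installed.

```python
from pathlib import Path

# Input extensions listed in this README
SUPPORTED_INPUTS = {".docx", ".doc", ".rtf", ".txt", ".md"}


def plan_conversions(input_dir, output_dir):
    """Return (source, target) path pairs for every supported file in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pairs = []
    for source in sorted(input_dir.iterdir()):
        if source.is_file() and source.suffix.lower() in SUPPORTED_INPUTS:
            # Keep the original extension in the target name so that
            # report.doc and report.docx do not overwrite each other.
            pairs.append((source, output_dir / (source.name + ".md")))
    return pairs


def convert_all(input_dir, output_dir):
    """Convert every supported file to Markdown (requires aspose-words-foss)."""
    import aspose.words_foss as aw  # lazy import: planning works without the package

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for source, target in plan_conversions(input_dir, output_dir):
        doc = aw.Document(str(source))
        doc.save(str(target), aw.SaveFormat.MARKDOWN)
```

For example, convert_all("docs", "docs_md") would write docs_md/report.docx.md for docs/report.docx.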

Export to PDF

import aspose.words_foss as aw

doc = aw.Document("input.docx")
doc.save("output.pdf", aw.SaveFormat.PDF)

Export to DOCX

import aspose.words_foss as aw

doc = aw.Document("input.docx")  # or .doc, .rtf
doc.save("output.docx", aw.SaveFormat.DOCX)

Extract plain text

import aspose.words_foss as aw

doc = aw.Document("input.docx")
text = doc.get_text()

Save with options

import aspose.words_foss as aw
from aspose.words_foss.saving import (
    MarkdownSaveOptions,
    OoxmlSaveOptions,
    PdfSaveOptions,
    CompressionLevel,
)

doc = aw.Document("input.docx")

# Markdown: underline, encoding, paragraph break
md_opts = MarkdownSaveOptions()
md_opts.export_underline_formatting = True
md_opts.encoding = "utf-8-sig"        # write a UTF-8 BOM
md_opts.paragraph_break = "\r\n"      # CRLF between paragraphs
doc.save("output.md", md_opts)

# DOCX: compression level
ooxml_opts = OoxmlSaveOptions()
ooxml_opts.compression_level = CompressionLevel.MAXIMUM
doc.save("output.docx", ooxml_opts)

# PDF: default options
pdf_opts = PdfSaveOptions()
doc.save("output.pdf", pdf_opts)
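
To show what the encoding and paragraph_break values used above do at the byte level, here is a standard-library sketch. It does not call the library: render_paragraphs is a hypothetical stand-in that joins paragraphs with the break sequence and encodes the result, following the "CRLF between paragraphs" comment above.

```python
import codecs


def render_paragraphs(paragraphs, encoding="utf-8", paragraph_break="\n"):
    """Join paragraphs with the break sequence and encode the result to bytes."""
    return paragraph_break.join(paragraphs).encode(encoding)


data = render_paragraphs(
    ["First", "Second"], encoding="utf-8-sig", paragraph_break="\r\n"
)
print(data.startswith(codecs.BOM_UTF8))  # True: "utf-8-sig" prepends the UTF-8 BOM
print(data)  # b'\xef\xbb\xbfFirst\r\nSecond'
```

A BOM plus CRLF is mainly useful for Windows tools that expect them; plain "utf-8" with "\n" is the safer choice elsewhere.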

Requirements

  • Python 3.10 or higher
  • olefile >= 0.46
  • fpdf2 >= 2.7.0
  • pydantic >= 2.0.0
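
A quick way to see whether an environment has these pieces is to query the installed distributions. This is only a sketch with hypothetical helper names (installed_versions, python_is_supported). It reports versions but does not compare them against the minimums above.

```python
import sys
from importlib import metadata

# Distribution names taken from the requirements list above
REQUIRED = ("olefile", "fpdf2", "pydantic")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```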

API Examples

Runnable examples demonstrating the aspose.words_foss API: ApiExamples folder

Files

File                                   What it shows
convert_document.py                    Every input format (DOCX, DOC, RTF, TXT, MD) to every output format (Markdown, PDF, TXT)
working_with_markdown_save_options.py  MarkdownSaveOptions: export_underline_formatting, encoding, paragraph_break
working_with_ooxml_save_options.py     OoxmlSaveOptions for DOCX export: pretty_format, compression_level
working_with_pdf_save_options.py       PDF export from all input formats
working_with_txt_save_options.py       Plain-text export and get_text()
working_with_images.py                 Image-containing documents to all output formats

Running

# Individual scripts
python ApiExamples/convert_document.py

# All via pytest
python -m pytest ApiExamples/ -v --rootdir=ApiExamples -c ApiExamples/pytest.ini

Input / Output

  • Input: tests/data/input/ (shared test fixtures)
  • Output: ApiExamples/output/ (git-ignored)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

aspose_words_foss-26.5.0.tar.gz (559.6 kB)

Uploaded Source

Built Distribution

aspose_words_foss-26.5.0-py3-none-any.whl (266.5 kB)

Uploaded Python 3

File details

Details for the file aspose_words_foss-26.5.0.tar.gz.

File metadata

  • Download URL: aspose_words_foss-26.5.0.tar.gz
  • Upload date:
  • Size: 559.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.28

File hashes

Hashes for aspose_words_foss-26.5.0.tar.gz
Algorithm Hash digest
SHA256 09ce50ae693257b347d72587ee65ad4812a076dcbc81123feb3c6533a4c34477
MD5 d7bdf96e0a0fd5604191c7c20e75ea13
BLAKE2b-256 062525453955825080f86c0807a01fa5a0e6b467d54a0361136950cd9d51e899
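
A downloaded archive can be checked against the SHA256 value listed above with the standard library. This is a minimal sketch: sha256_of_file and matches_sha256 are hypothetical helper names, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_sha256("aspose_words_foss-26.5.0.tar.gz", "09ce50ae693257b347d72587ee65ad4812a076dcbc81123feb3c6533a4c34477") should return True for an intact download.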

File details

Details for the file aspose_words_foss-26.5.0-py3-none-any.whl.

File metadata

  • Download URL: aspose_words_foss-26.5.0-py3-none-any.whl
  • Upload date:
  • Size: 266.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.28

File hashes

Hashes for aspose_words_foss-26.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b0a4624f2f992ffbaafb8f87c40d2230f24c713ae2ea118910f2124e2b4eccd9
MD5 dbedf45c6b5932dd3240313f9c143b20
BLAKE2b-256 0c3b95d072ac3f6e8cf73d1dac5b36bb8a689475fef109a341d5597d187e5566
