GroupDocs.Comparison for Python via .NET - Compare documents and detect differences

Product Page | Docs | Demos | API Reference | Blog | Free Support | Temporary License

GroupDocs.Comparison for Python via .NET is a document comparison API that detects text, style, and formatting differences between two or more documents and produces a single result file with the differences marked up. It supports DOCX, PDF, XLSX, PPTX, ODT, HTML, TXT, images, and more — with full control over sensitivity, change styling, summary pages, and accept/reject workflows.

Get Started

pip install groupdocs-comparison-net

from groupdocs.comparison import Comparer

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    comparer.compare("result.docx")

How It Works

The package is a self-contained Python wheel (~140 MB) that includes everything needed to compare documents. No external software such as Microsoft Office is required: just pip install and start comparing. On Linux and macOS, some operations may additionally need the native libgdiplus library (see Troubleshooting). The wheel works across Python 3.5 - 3.14 on Windows, Linux, and macOS (Intel + Apple Silicon).

Features

  • Wide Format Support: DOCX, PDF, XLSX, PPTX, ODT, HTML, TXT, images, and many more.
  • Granular Change Detection: Text, style, formatting, table, bookmark, and revision changes — each with coordinates.
  • Accept / Reject Workflow: Iterate detected changes, mark each as accepted or rejected, and re-emit the resultant document.
  • Multi-Source Comparison: Compare one source against multiple targets in a single pass.
  • Password-Protected Documents: Open protected sources/targets and optionally re-protect the output.
  • Tunable Comparison Options: Sensitivity, style customization, summary pages, header/footer toggles, calculate-coordinates mode, and more.
  • No External Dependencies: No Microsoft Office or other software required.
  • Cross-Platform: Windows x64/x86, Linux x64, macOS x64/ARM64.

Supported File Formats

For a complete list, see supported formats.

  • Microsoft Office (Word, Excel, PowerPoint, Visio)
  • PDF
  • OpenDocument (ODT, ODS, ODP)
  • Web (HTML, MHTML)
  • Text/Markup (TXT, HTML)
  • Images (PNG, JPG, BMP, TIFF)
  • Email (EML, MSG)
  • AutoCAD (DWG, DXF)

Examples

Compare Two Documents

from groupdocs.comparison import Comparer

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    comparer.compare("result.docx")

Compare with Options

from groupdocs.comparison import Comparer
from groupdocs.comparison.options import CompareOptions

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    options = CompareOptions()
    options.sensitivity_of_comparison = 75
    options.detect_style_changes = True
    options.generate_summary_page = True
    comparer.compare("result.docx", options)

Get Detected Changes

from groupdocs.comparison import Comparer

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    comparer.compare()
    for change in comparer.get_changes():
        print(f"[{change.type}] {change.text}")
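
When a comparison yields many changes, a per-type tally is often easier to read than the raw listing. The sketch below relies only on the type attribute used in the loop above. Both count_by_type and summarize_file_pair are hypothetical helper names, not part of the package, and the library is imported lazily so the tallying logic can be tried without it.

```python
from collections import Counter


def count_by_type(changes):
    """Tally changes by the string form of their ``type`` attribute."""
    return Counter(str(change.type) for change in changes)


def summarize_file_pair(source_path, target_path):
    """Compare two files and return a Counter of change types.

    Uses only the calls shown above: Comparer, add, compare, get_changes.
    """
    # Lazy import: requires the groupdocs-comparison-net wheel to be installed.
    from groupdocs.comparison import Comparer

    with Comparer(source_path) as comparer:
        comparer.add(target_path)
        comparer.compare()
        return count_by_type(comparer.get_changes())
```

Calling summarize_file_pair("source.docx", "target.docx") would then return a Counter that can be printed or sorted with most_common().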

Accept / Reject Individual Changes

from groupdocs.comparison import Comparer
from groupdocs.comparison.options import ApplyChangeOptions
from groupdocs.comparison.result import ComparisonAction

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    comparer.compare()
    changes = comparer.get_changes()
    # Reject the first change, accept the rest
    changes[0].comparison_action = ComparisonAction.REJECT
    apply_opts = ApplyChangeOptions()
    apply_opts.changes = changes
    comparer.apply_changes("result.docx", apply_opts)

Compare Multiple Targets

from groupdocs.comparison import Comparer

with Comparer("source.docx") as comparer:
    comparer.add("target1.docx")
    comparer.add("target2.docx")
    comparer.add("target3.docx")
    comparer.compare("result.docx")

Password-Protected Documents

from groupdocs.comparison import Comparer
from groupdocs.comparison.options import LoadOptions, SaveOptions

# Passwords for opening each input document
src_load = LoadOptions()
src_load.password = "1234"
tgt_load = LoadOptions()
tgt_load.password = "5678"

# Password used to protect the output document
save_opts = SaveOptions()
save_opts.password = "out-secret"

with Comparer("protected_source.docx", load_options=src_load) as comparer:
    comparer.add("protected_target.docx", load_options=tgt_load)
    comparer.compare("result.docx", save_options=save_opts)

Compare from Stream / BytesIO

import io
from groupdocs.comparison import Comparer

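# Option 1: pass open binary file objects
with open("source.docx", "rb") as src, open("target.docx", "rb") as tgt:
    with Comparer(src) as comparer:
        comparer.add(tgt)
        comparer.compare("result.docx")

# Option 2: pass in-memory buffers (the bytes are read from disk here for illustration)
with open("source.docx", "rb") as f:
    source_bytes = f.read()
with open("target.docx", "rb") as f:
    target_bytes = f.read()

src_buf = io.BytesIO(source_bytes)
tgt_buf = io.BytesIO(target_bytes)
with Comparer(src_buf) as comparer:
    comparer.add(tgt_buf)
    comparer.compare("result.docx")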

Get Document Info

from groupdocs.comparison import Comparer

with Comparer("document.docx") as comparer:
    info = comparer.source.get_document_info()
    print(f"Pages: {info.page_count}, Size: {info.size}")

Customise Change Styling

from groupdocs.comparison import Comparer, Color
from groupdocs.comparison.options import CompareOptions, StyleSettings

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    options = CompareOptions()
    options.inserted_item_style = StyleSettings()
    options.inserted_item_style.font_color = Color.from_name("firebrick")
    options.deleted_item_style = StyleSettings()
    options.deleted_item_style.font_color = (200, 100, 50)   # RGB tuple
    options.changed_item_style = StyleSettings()
    options.changed_item_style.font_color = "#0000FF"        # hex string
    comparer.compare("result.docx", options)

StyleSettings colour properties accept a Color, an RGB/RGBA tuple, a packed-ARGB int, a '#RRGGBB'/'#AARRGGBB' hex string, or a named colour string ("red", "firebrick", ...).
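
To make the relationship between these representations concrete, the sketch below converts a hex string or an RGB/RGBA tuple into a packed-ARGB integer. It is an illustration only, not the library's own conversion code: the helper names are hypothetical, and it assumes tuples are ordered (R, G, B[, A]) with alpha defaulting to 255 when omitted.

```python
def hex_to_argb_int(value):
    """Convert '#RRGGBB' or '#AARRGGBB' to a packed 0xAARRGGBB integer."""
    digits = value.lstrip("#")
    if len(digits) == 6:
        digits = "FF" + digits  # assumption: fully opaque when alpha is omitted
    if len(digits) != 8:
        raise ValueError(f"expected #RRGGBB or #AARRGGBB, got {value!r}")
    return int(digits, 16)


def tuple_to_argb_int(rgb):
    """Convert (R, G, B) or (R, G, B, A) to a packed 0xAARRGGBB integer."""
    if len(rgb) == 3:
        r, g, b = rgb
        a = 255  # assumption: fully opaque when alpha is omitted
    elif len(rgb) == 4:
        r, g, b, a = rgb
    else:
        raise ValueError("expected 3 or 4 components")
    for component in (r, g, b, a):
        if not 0 <= component <= 255:
            raise ValueError("components must be in 0..255")
    return (a << 24) | (r << 16) | (g << 8) | b


print(hex(hex_to_argb_int("#0000FF")))         # 0xff0000ff
print(hex(tuple_to_argb_int((200, 100, 50))))  # 0xffc86432
```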

Render Page Previews

from groupdocs.comparison import Comparer
from groupdocs.comparison.options import PreviewOptions, PreviewFormats

def create_page_stream(page_number):
    return open(f"page-{page_number}.png", "wb")

with Comparer("source.docx") as comparer:
    preview = PreviewOptions(create_page_stream)
    preview.preview_format = PreviewFormats.PNG
    preview.page_numbers = [1, 2, 3]
    comparer.source.generate_preview(preview)

Get Change Coordinates

from groupdocs.comparison import Comparer
from groupdocs.comparison.options import CompareOptions

with Comparer("source.docx") as comparer:
    comparer.add("target.docx")
    options = CompareOptions()
    options.calculate_coordinates = True
    comparer.compare("result.docx", options)
    for change in comparer.get_changes():
        b = change.box
        print(f"({b.x:.0f}, {b.y:.0f}) {b.width:.0f}x{b.height:.0f}: {change.text!r}")
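
Once coordinates are available, changes can be filtered by location, for example to keep only those inside a particular area of the page. The sketch below assumes each change exposes box.x, box.y, box.width, and box.height as in the loop above, in whatever units the library reports. changes_in_region is a hypothetical helper, and it does not take page numbers into account.

```python
def changes_in_region(changes, left, top, width, height):
    """Return the changes whose box lies entirely inside the given rectangle."""
    right = left + width
    bottom = top + height
    selected = []
    for change in changes:
        box = change.box
        inside_x = left <= box.x and box.x + box.width <= right
        inside_y = top <= box.y and box.y + box.height <= bottom
        if inside_x and inside_y:
            selected.append(change)
    return selected
```

Passing the list returned by comparer.get_changes() together with a rectangle yields only the changes fully contained in it; boxes that merely overlap the rectangle are excluded.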

Command Line

Installing the wheel also puts groupdocs-comparison on PATH.

# Compare two documents (output format is inferred from the extension)
groupdocs-comparison compare source.docx target.docx result.docx

# Tweak the comparison
groupdocs-comparison compare source.docx target.docx result.docx \
    --sensitivity 75 --detect-style-changes --generate-summary-page

# Password-protected sources / output
groupdocs-comparison compare src.docx tgt.docx out.docx --password "1234"
groupdocs-comparison compare src.docx tgt.docx out.docx --output-password "secret"

# Inspect a document
groupdocs-comparison info source.docx

# List every file type the engine recognises
groupdocs-comparison list-formats

# Apply a license up-front
groupdocs-comparison --license license.lic compare a.docx b.docx out.docx

# Equivalent module form
python -m groupdocs.comparison --version

Exit codes: 0 success, 1 runtime error, 2 user error (missing input).

The CLI covers single-pair comparison, document info, and the format catalogue. For workflows that need callbacks, in-memory streams, multi-target comparison in one Comparer, or accept/reject-individual-changes flows, drop into the Python API.
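
For scripts that shell out to the CLI, the commands and exit codes above can be wrapped in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and it uses only the compare subcommand and the --sensitivity, --detect-style-changes, and --generate-summary-page flags shown above.

```python
import subprocess

# Exit codes as documented above.
EXIT_MEANINGS = {
    0: "success",
    1: "runtime error",
    2: "user error (missing input)",
}


def build_compare_argv(source, target, result, sensitivity=None,
                       detect_style_changes=False, generate_summary_page=False):
    """Build the argument list for `groupdocs-comparison compare`."""
    argv = ["groupdocs-comparison", "compare", source, target, result]
    if sensitivity is not None:
        argv += ["--sensitivity", str(sensitivity)]
    if detect_style_changes:
        argv.append("--detect-style-changes")
    if generate_summary_page:
        argv.append("--generate-summary-page")
    return argv


def describe_exit_code(code):
    """Map a CLI exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def run_compare(source, target, result, **options):
    """Run the comparison and return (exit_code, meaning)."""
    completed = subprocess.run(build_compare_argv(source, target, result, **options))
    return completed.returncode, describe_exit_code(completed.returncode)
```

Building the argument list separately from running it keeps the command easy to log and to check before anything is executed.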

AI Agent & LLM Friendly

This package is designed for seamless integration with AI agents, LLMs, and automated code generation tools.

  • AGENTS.md in the package — AI coding assistants (Claude Code, Cursor, GitHub Copilot) auto-discover the API surface, usage patterns, and troubleshooting tips from the installed package
  • MCP server — connect your AI tool to GroupDocs documentation for on-demand API lookups:
    { "mcpServers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } }
    
  • Machine-readable docs — full documentation available as plain text for RAG and LLM context:
    • Single file: https://docs.groupdocs.com/comparison/python-net/llms-full.txt
    • Per page: append .md to any docs URL

Evaluation Mode

The API works without a license in evaluation mode with the following limitations:

  • An evaluation watermark is added to output documents.
  • A page / document-count cap applies.

To remove these limitations, apply a license or request a temporary license:

from groupdocs.comparison import License
License().set_license("path/to/license.lic")

Or set the environment variable (auto-applied at import):

export GROUPDOCS_LIC_PATH="path/to/license.lic"
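
A mistyped path in GROUPDOCS_LIC_PATH is easy to overlook, and this page does not say how the library reacts to a missing file. The sketch below checks the variable before anything is imported and applies the license explicitly only when the file exists. The helper names are hypothetical; only the environment variable name and the License().set_license call come from the text above.

```python
import os
from pathlib import Path


def resolve_license_path(environ=None):
    """Return the license file named by GROUPDOCS_LIC_PATH, or None if unusable."""
    environ = os.environ if environ is None else environ
    raw = environ.get("GROUPDOCS_LIC_PATH", "").strip()
    if not raw:
        return None
    path = Path(raw).expanduser()
    return path if path.is_file() else None


def apply_license_if_available(environ=None):
    """Apply the license when the file exists; return False to signal evaluation mode."""
    path = resolve_license_path(environ)
    if path is None:
        return False
    # Lazy import: requires the groupdocs-comparison-net wheel to be installed.
    from groupdocs.comparison import License

    License().set_license(str(path))
    return True
```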

Troubleshooting

Issue | Platform | Fix
System.Drawing.Common is not supported | Linux/macOS | apt-get install libgdiplus (Linux) or brew install mono-libgdiplus (macOS)
Garbled text or missing fonts in PDF | Linux | apt-get install ttf-mscorefonts-installer fontconfig && fc-cache -f
The type initializer for 'Gdip' threw an exception (Intel macOS / Linux) | macOS x64 / Linux | brew install mono-libgdiplus (macOS) / apt install libgdiplus (Linux)
Gdip initializer failing on macOS Apple Silicon when comparing PDF or images | macOS arm64 | Known upstream-side issue: comparison's PDF/image diff path uses System.Drawing.Common, which doesn't reliably load libgdiplus on mac-arm64 even when installed. DOCX/TXT/HTML/PPTX compares are unaffected. Workaround: target a non-rasterized output (DOCX/HTML), or run on macOS x64 / Linux / Windows. Engine-side fix pending.
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT errors | Linux | Do NOT set this variable. ICU must be available.
DllNotFoundException: libSkiaSharp | macOS | A stale system copy can shadow the bundled lib. Rename it: sudo mv /usr/local/lib/libSkiaSharp.dylib /usr/local/lib/libSkiaSharp.dylib.bak
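
Two of the issues above can be spotted before running a comparison. The sketch below is a hypothetical preflight check, not something shipped with the package: it flags a set DOTNET_SYSTEM_GLOBALIZATION_INVARIANT variable on Linux and uses the standard library's ctypes.util.find_library to look for libgdiplus on Linux and macOS. find_library may miss libraries installed in non-standard locations, so a warning here is a hint rather than a diagnosis.

```python
import ctypes.util
import sys


def preflight_warnings(environ, platform_name=sys.platform,
                       find_library=ctypes.util.find_library):
    """Return human-readable warnings for issues listed in the table above."""
    warnings = []
    on_linux = platform_name.startswith("linux")
    on_macos = platform_name.startswith("darwin")
    if on_linux and "DOTNET_SYSTEM_GLOBALIZATION_INVARIANT" in environ:
        warnings.append(
            "Unset DOTNET_SYSTEM_GLOBALIZATION_INVARIANT; ICU must be available."
        )
    if (on_linux or on_macos) and find_library("gdiplus") is None:
        warnings.append(
            "libgdiplus was not found; System.Drawing.Common or Gdip errors are likely."
        )
    return warnings
```

Calling preflight_warnings(os.environ) after importing os checks the current process; the platform_name and find_library parameters exist only so the function can be exercised on any machine.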

System Requirements

  • Python 3.5 - 3.14
  • Windows x64/x86, Linux x64, macOS x64/ARM64
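
The requirements above can be expressed as a quick check to run before installing. This is a sketch under stated assumptions: the support matrix is transcribed from the two bullets above, while is_supported and the mapping from platform.machine() spellings to architecture names are illustrative and may not cover every environment (for example, a 32-bit Python on 64-bit Windows).

```python
import platform
import sys

# Support matrix transcribed from the list above.
MIN_PYTHON = (3, 5)
MAX_PYTHON = (3, 14)
SUPPORTED_TARGETS = {
    ("Windows", "x64"), ("Windows", "x86"),
    ("Linux", "x64"),
    ("Darwin", "x64"), ("Darwin", "arm64"),
}

# Assumption: common platform.machine() values mapped to the names used above.
MACHINE_ALIASES = {
    "amd64": "x64", "x86_64": "x64",
    "x86": "x86", "i386": "x86", "i686": "x86",
    "arm64": "arm64", "aarch64": "arm64",
}


def is_supported(python_version, system, machine):
    """Return True when the interpreter and platform fall inside the matrix."""
    arch = MACHINE_ALIASES.get(machine.lower())
    version_ok = MIN_PYTHON <= tuple(python_version[:2]) <= MAX_PYTHON
    return version_ok and (system, arch) in SUPPORTED_TARGETS


if __name__ == "__main__":
    print(is_supported(sys.version_info, platform.system(), platform.machine()))
```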

More Resources

Also available for other platforms: .NET | Java | Node.js


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

groupdocs_comparison_net-26.5.0-py3-none-win_amd64.whl (147.2 MB)

Uploaded: Python 3, Windows x86-64

groupdocs_comparison_net-26.5.0-py3-none-macosx_11_0_arm64.whl (146.4 MB)

Uploaded: Python 3, macOS 11.0+ ARM64

groupdocs_comparison_net-26.5.0-py3-none-macosx_10_14_x86_64.whl (148.6 MB)

Uploaded: Python 3, macOS 10.14+ x86-64
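
The file names above follow the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag). As an illustration, the sketch below splits such a name into its parts; parse_wheel_filename is a hypothetical helper written for this example, not part of the package or of pip.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its naming-scheme components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("groupdocs_comparison_net-26.5.0-py3-none-win_amd64.whl")
print(info["platform"])  # win_amd64
```

Here py3 and none indicate that the wheel is not tied to a specific Python minor version or ABI; only the platform tag differs between the files.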

File details

Details for the file groupdocs_comparison_net-26.5.0-py3-none-win_amd64.whl.

File hashes

Hashes for groupdocs_comparison_net-26.5.0-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 eb9b24fafc1c7702343232c31a99aebba38520c191098aefec20396a4ff2934d
MD5 7b9cca77c6211e9db1fa3e228130b69c
BLAKE2b-256 c440b87d999aacec49378b131fac292010a9ccf68e1a554686fc5a0c46910796

See more details on using hashes here.
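
A downloaded wheel can be checked against the published SHA256 digest with the standard library alone. The sketch below reads the file in chunks so a wheel of this size is never loaded into memory at once; the function names are illustrative.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the Windows wheel, the expected value is the SHA256 digest listed above, passed as the second argument to matches_published_hash.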

File details

Details for the file groupdocs_comparison_net-26.5.0-py3-none-manylinux1_x86_64.whl.

File hashes

Hashes for groupdocs_comparison_net-26.5.0-py3-none-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 26035c69bd4eecb58abd5269a65387d6a3f0da736d4cb6581f4acf9294583147
MD5 9490c48ab9169cb38b2dbc8d8454eeb1
BLAKE2b-256 04c5535b72595e2216554bf435b26ab6a409fbd3e602ac66a1b68fd4f0420b72

File details

Details for the file groupdocs_comparison_net-26.5.0-py3-none-macosx_11_0_arm64.whl.

File hashes

Hashes for groupdocs_comparison_net-26.5.0-py3-none-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 922237f043373f4e6cfad74ac47e159d8f254ce05811cfa6fc129ed0161b98af
MD5 bdb366d8dcd42ebc07e98351f0b7bffa
BLAKE2b-256 11a0d339f3586c51501f07f4bcd83c19bed28022de9a67b5c5daed15ca5af0a6

File details

Details for the file groupdocs_comparison_net-26.5.0-py3-none-macosx_10_14_x86_64.whl.

File hashes

Hashes for groupdocs_comparison_net-26.5.0-py3-none-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 35d83d03d4dc16da985f3c35c7229f091b37300784fbd30578b71e903b958e73
MD5 c3f6962650ca4700d0f7012cc183f022
BLAKE2b-256 f879c8513710d1cdcf04a373dde9046b6059ef9aeb0c098df3cc687de171fe83
