
Browse and diff Microsoft Office .docx, .xlsx, and .pptx files.

Project description

opc-diag is a command-line application for exploring Microsoft Word, Excel, and PowerPoint files from Office 2007 and later. These files use the Office Open XML format, and their structure adheres to the Open Packaging Conventions (OPC), specified in ISO/IEC 29500.
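As a rough illustration of what "adheres to OPC" means in practice: an OPC package is a ZIP archive whose members are the package parts. The sketch below uses only the Python standard library, not opc-diag itself. The function names and the tiny in-memory package are hypothetical stand-ins for a real .docx file.

```python
import io
import zipfile


def list_parts(package):
    """Return the sorted member names of an OPC package (a ZIP archive)."""
    with zipfile.ZipFile(package) as zf:
        return sorted(zf.namelist())


def build_demo_package():
    """Build a tiny in-memory stand-in for a .docx (hypothetical content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("word/document.xml", "<document/>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name in list_parts(build_demo_package()):
        print(name)
```

Passing the path of a real .docx, .xlsx, or .pptx file to list_parts() should work the same way, since zipfile.ZipFile accepts either a path or a file-like object.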

opc-diag provides the opc command, which can browse, diff, extract, and repackage OPC files, and substitute parts from one package into another.

Its primary use is by developers of software that generates and/or manipulates Microsoft Office documents.

A typical use is diffing a Word file saved before and after an operation, such as inserting a paragraph, to identify the specific changes Word made to the XML. This is handy when developing software that does the same without Word’s help:

$ opc diff before.docx after.docx
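To give a feel for what such a diff involves, here is a minimal sketch of the general idea using only the Python standard library. It is not opc-diag's implementation, and its output format is not claimed to match the opc command. The names read_parts, pretty, diff_packages, and make_package are hypothetical, and the demo packages are simplified stand-ins for real Word files.

```python
import difflib
import io
import zipfile
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def read_parts(package):
    """Map each part name in the package to its raw bytes."""
    with zipfile.ZipFile(package) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def pretty(data):
    """Pretty-print an XML part as lines; fall back for non-XML parts."""
    try:
        return minidom.parseString(data).toprettyxml(indent="  ").splitlines()
    except ExpatError:
        return ["<non-XML part, %d bytes>" % len(data)]


def diff_packages(before, after):
    """Return unified-diff lines for every part that differs."""
    a, b = read_parts(before), read_parts(after)
    lines = []
    for name in sorted(set(a) | set(b)):
        if a.get(name) == b.get(name):
            continue  # identical bytes, nothing to report
        old = pretty(a[name]) if name in a else []
        new = pretty(b[name]) if name in b else []
        lines.extend(
            difflib.unified_diff(old, new, fromfile=name, tofile=name, lineterm="")
        )
    return lines


def make_package(paragraphs):
    """Hypothetical helper: build an in-memory package with one XML part."""
    body = "".join("<p>%s</p>" % text for text in paragraphs)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", "<document>%s</document>" % body)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    before = make_package(["first"])
    after = make_package(["first", "second"])
    print("\n".join(diff_packages(before, after)))
```

Pretty-printing before diffing matters because Office applications commonly write each XML part as one long line, which makes a raw line-based diff nearly unreadable.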

Another main use is diagnosing why an Office document does not load cleanly, typically because the software that generated it has a bug. These problems can be tedious and difficult to diagnose without tools like opc-diag, and they were the primary motivation for developing it.
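One narrow class of such bugs is a part that is not well-formed XML. The sketch below checks for that using only the Python standard library. It is an illustration under stated assumptions, not a feature of opc-diag: the function names are hypothetical, only members ending in .xml or .rels are checked, and a package that passes this check can still fail to load for other reasons.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def find_malformed_parts(package):
    """Return (part name, parser message) for each part that fails to parse."""
    problems = []
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            # Assumption: XML parts are recognizable by these two extensions.
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(zf.read(name))
            except ET.ParseError as exc:
                problems.append((name, str(exc)))
    return problems


def make_broken_package():
    """Hypothetical helper: a package with one good and one malformed part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("word/document.xml", "<document><p></document>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, message in find_malformed_parts(make_broken_package()):
        print(name, "->", message)
```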

More information is available in the opc-diag documentation.

History

1.0.0 (2014-01-14)

  • Add pretty-printing of extracted XML on extract command

0.9.8 (2013-12-13)

  • hotfix – fix UnicodeEncodeError on output containing non-ASCII chars

0.9.7 (2013-09-23)

  • Initial release – supporting browse, diff, diff-item, extract, repackage, and substitute subcommands.

