Decrypt password-protected OpenDocument Format (ODF) files from LibreOffice and Apache OpenOffice

odf-decrypt

A Python library for decrypting password-protected OpenDocument Format (ODF) files. It supports files created by both LibreOffice and Apache OpenOffice and handles modern and legacy encryption formats. Only password-based encryption is currently supported; GPG key encryption is not yet implemented.

Features

  • Dual Application Support: Decrypt files from both LibreOffice and Apache OpenOffice
  • Modern Encryption: AES-256-GCM with Argon2id key derivation (LibreOffice)
  • Legacy Encryption: Blowfish-CFB with PBKDF2-SHA1 (Apache OpenOffice)
  • Automatic Detection: Identifies the source application and encryption format
  • All ODF Types: Supports .odt, .ods, .odp, .odg, .odf, and more
  • Command-Line Tool: Simple CLI for quick decryption tasks

Supported Encryption Formats

| Application | Algorithm | Key Derivation | Status |
|---|---|---|---|
| LibreOffice (modern) | AES-256-GCM | Argon2id | Supported |
| LibreOffice (legacy) | AES-256-CBC | PBKDF2-SHA1 | Supported |
| Apache OpenOffice | Blowfish-CFB | PBKDF2-SHA1 | Supported |
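
To make the PBKDF2-SHA1 entries concrete, here is a minimal sketch of password-based key derivation as the ODF 1.2 package specification describes it: the password is first hashed into a start key, which PBKDF2 then stretches using the salt and iteration count recorded in the manifest. It uses only the standard library. `derive_odf_key` and its parameters are hypothetical names chosen for illustration; they are not part of the odfdecrypt API, and the library's internals may differ.

```python
import hashlib


def derive_odf_key(password: str, salt: bytes, iterations: int,
                   key_size: int, start_key_hash: str = "sha1") -> bytes:
    """Derive a key following the ODF 1.2 PBKDF2 scheme (illustrative only).

    Assumption: this mirrors the specification, not odfdecrypt's own code.
    """
    # Step 1: the "start key" is a digest of the UTF-8 encoded password.
    start_key = hashlib.new(start_key_hash, password.encode("utf-8")).digest()
    # Step 2: PBKDF2 with HMAC-SHA1 stretches the start key using the salt
    # and iteration count stored alongside each encrypted entry.
    return hashlib.pbkdf2_hmac("sha1", start_key, salt, iterations, dklen=key_size)


# Illustrative values only; real salts and counts come from manifest.xml.
key = derive_odf_key("password", salt=bytes(16), iterations=1024, key_size=16)
print(len(key))  # 16
```

The derived key is only one input to decryption; the cipher, initialisation vector, and checksum handling are left out of this sketch.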

Installation

pip install odfdecrypt

Or with uv:

uv add odfdecrypt

Quick Start

High-Level API (Recommended)

The simplest way to decrypt ODF files is using the high-level API functions:

from odfdecrypt import decrypt

# Auto-detect origin and decrypt
decrypted_file = decrypt("document.odt", "password")

# Save the decrypted file
with open("decrypted.odt", "wb") as f:
    f.write(decrypted_file.read())

Working with File Objects

You can also work with file objects directly:

import io
from odfdecrypt import decrypt

# Decrypt from BytesIO
with open("document.odt", "rb") as f:
    file_buffer = io.BytesIO(f.read())

decrypted_buffer = decrypt(file_buffer, "password")

# Process the decrypted content without saving to disk
decrypted_content = decrypted_buffer.getvalue()

Origin Detection

Detect the source application before decryption:

from odfdecrypt import detect_origin, OpenOfficeOrigin

origin = detect_origin("document.odt")
print(f"Document created by: {origin}")

# Output: Document created by: OpenOfficeOrigin.LIBREOFFICE
# or: Document created by: OpenOfficeOrigin.APACHE_OPEN_OFFICE

Advanced Usage with Specific Decryptors

For more control, use specific decryptors:

from odfdecrypt import LibreOfficeDecryptor, AOODecryptor

# LibreOffice decryptor
libre_decryptor = LibreOfficeDecryptor()
decrypted_file = libre_decryptor.decrypt("document.odt", "password")

# Apache OpenOffice decryptor
aoo_decryptor = AOODecryptor()
decrypted_file = aoo_decryptor.decrypt("document.odt", "password")

Auto-Detection with Fallback

The high-level API automatically handles fallback scenarios:

import io
from odfdecrypt import decrypt, IncorrectPasswordError

# This will try LibreOffice first, then Apache OpenOffice if needed
decrypted_file = decrypt("document.odt", "password")

# Works with unknown origins or BytesIO objects
with open("document.odt", "rb") as f:
    io_buffer = io.BytesIO(f.read())

try:
    decrypted = decrypt(io_buffer, "password")
except IncorrectPasswordError:
    print("Wrong password!")

Check if File is Encrypted

Before decryption, you can check if a file is encrypted:

from odfdecrypt import is_encrypted

if is_encrypted("document.odt"):
    print("File is password protected")
else:
    print("File is not encrypted")
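
For readers curious what such a check involves, here is a rough sketch of one way to detect encryption without the library. In an encrypted ODF package, each encrypted entry in META-INF/manifest.xml carries a `manifest:encryption-data` child element. The function name `manifest_declares_encryption` is hypothetical, and this is not necessarily how `is_encrypted` is implemented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Union

# Namespace of the ODF package manifest.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def manifest_declares_encryption(odf: Union[str, io.BytesIO]) -> bool:
    """Return True if any manifest entry has a manifest:encryption-data child."""
    # ODF files are ZIP archives; the manifest itself is never encrypted.
    with zipfile.ZipFile(odf) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    # Search the whole tree for an encryption-data element.
    return root.find(f".//{{{MANIFEST_NS}}}encryption-data") is not None
```

The function accepts either a path or a BytesIO buffer, because `zipfile.ZipFile` handles both.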

API Reference

High-Level API Functions

decrypt(odf, password)

Auto-detects the ODF origin and decrypts the file using the appropriate method.

decrypt(odf: str | PathLike | io.BytesIO, password: str) -> io.BytesIO

Parameters:

  • odf: Path to ODF file or BytesIO object containing the encrypted file
  • password: Password to decrypt the file

Returns: io.BytesIO containing the decrypted ODF file

Features:

  • Auto-detects LibreOffice vs Apache OpenOffice files
  • Works with both file paths and BytesIO objects
  • Automatic fallback for unknown origins
  • Raises IncorrectPasswordError for wrong passwords

detect_origin(file_path)

Detects whether an ODF file was created by LibreOffice or Apache OpenOffice.

detect_origin(file_path: str | PathLike) -> OpenOfficeOrigin

Parameters:

  • file_path: Path to the ODF file

Returns: OpenOfficeOrigin enum value:

  • OpenOfficeOrigin.LIBREOFFICE
  • OpenOfficeOrigin.APACHE_OPEN_OFFICE
  • OpenOfficeOrigin.UNKNOWN

Utility Functions

is_encrypted(file_path)

Checks if an ODF file is password protected.

is_encrypted(file_path: str) -> bool

Parameters:

  • file_path: Path to the ODF file

Returns: True if encrypted, False otherwise

Specialized Decryptors

LibreOfficeDecryptor

Decrypts ODF files encrypted by LibreOffice (modern and legacy formats).

LibreOfficeDecryptor().decrypt(file_path: str, password: str) -> io.BytesIO
LibreOfficeDecryptor().decrypt_from_bytes(odf: io.BytesIO, password: str) -> io.BytesIO
LibreOfficeDecryptor().decrypt_from_file(odf_path: str, password: str) -> io.BytesIO

Supported Algorithms:

  • Modern: AES-256-GCM with Argon2id key derivation
  • Legacy: AES-256-CBC with PBKDF2-SHA1 key derivation

AOODecryptor

Decrypts ODF files encrypted by Apache OpenOffice.

AOODecryptor().decrypt(file_path: str, password: str) -> io.BytesIO
AOODecryptor().decrypt_from_bytes(odf: io.BytesIO, password: str) -> io.BytesIO
AOODecryptor().decrypt_from_file(odf_path: str, password: str) -> io.BytesIO

Supported Algorithm:

  • Blowfish-CFB with PBKDF2-SHA1 key derivation

ODFOriginDetector

Advanced origin detection with additional options.

ODFOriginDetector().detect_origin(file_path: str) -> OpenOfficeOrigin

Features:

  • Analyzes manifest.xml and document structure
  • Detects application-specific metadata and encryption signatures

Error Handling

The library provides specific exceptions for different error scenarios:

from odfdecrypt import (
    decrypt,
    IncorrectPasswordError,
    InvalidODFFileError,
    UnsupportedEncryptionError,
    ODFDecryptError
)

try:
    decrypted_file = decrypt("document.odt", "password")
except IncorrectPasswordError:
    print("Wrong password provided")
except InvalidODFFileError:
    print("File is not a valid ODF document")
except UnsupportedEncryptionError:
    print("Encryption format not supported")
except ODFDecryptError as e:
    print(f"Decryption failed: {e}")

Best Practices

Performance Tips

  • Reuse decryptors: Create decryptors once and reuse them for multiple files
  • Use BytesIO: For processing multiple files in memory, use BytesIO objects
  • Check encryption first: Use is_encrypted() to avoid unnecessary decryption attempts

Security Considerations

  • Password handling: Never hardcode passwords in your code
  • File validation: Always validate that files are legitimate ODF documents
  • Memory management: Properly close file handles and BytesIO objects

The following batch helper applies the performance tips above:

from odfdecrypt import decrypt, is_encrypted

# Efficient batch processing
def decrypt_documents(file_paths, password):
    results = []
    for file_path in file_paths:
        if is_encrypted(file_path):
            try:
                decrypted = decrypt(file_path, password)
                results.append(decrypted)
            except Exception as e:
                print(f"Failed to decrypt {file_path}: {e}")
    return results
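
The password-handling advice under Security Considerations can be sketched as follows: read the password from an environment variable, and fall back to an interactive prompt that does not echo input. The helper name `read_password` and the variable name `ODF_PASSWORD` are assumptions made for this example; neither is defined by odfdecrypt.

```python
import getpass
import os


def read_password(env_var: str = "ODF_PASSWORD") -> str:
    """Return the password from the environment, or prompt for it."""
    value = os.environ.get(env_var)
    if value:
        return value
    # getpass does not echo the typed characters to the terminal.
    return getpass.getpass("ODF password: ")
```

A call such as `decrypt("document.odt", read_password())` then keeps the secret out of the source file.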

Supported File Types

| Extension | Type |
|---|---|
| .odt | Text Document |
| .ods | Spreadsheet |
| .odp | Presentation |
| .odg | Drawing |
| .odf | Formula |
| .odb | Database |
| .odc | Chart |
| .odm | Master Document |
| .ott | Text Template |
| .ots | Spreadsheet Template |
| .otp | Presentation Template |
| .otg | Drawing Template |
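
A file extension alone does not prove that a file is an ODF document. As a hedged sketch of the file-validation advice given earlier, the check below confirms that the input is a ZIP archive whose `mimetype` entry names an OpenDocument type. The `mimetype` entry stays readable in password-protected files. `looks_like_odf` is a hypothetical helper, not part of the odfdecrypt API.

```python
import io
import zipfile
from typing import Union

# All OpenDocument media types share this prefix.
ODF_MIME_PREFIX = "application/vnd.oasis.opendocument."


def looks_like_odf(odf: Union[str, io.BytesIO]) -> bool:
    """Return True if the input is a ZIP whose mimetype entry is an ODF type."""
    if not zipfile.is_zipfile(odf):
        return False
    with zipfile.ZipFile(odf) as zf:
        try:
            mimetype = zf.read("mimetype").decode("ascii", errors="replace").strip()
        except KeyError:
            return False  # archive has no mimetype entry
    return mimetype.startswith(ODF_MIME_PREFIX)
```

This is a cheap pre-filter only; it does not validate the manifest or the document content.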

Requirements

  • Python 3.10+
  • cryptography
  • pycryptodome
  • argon2-cffi

Development

Setup

git clone https://github.com/toobee/odf-decrypt.git
cd odf-decrypt
uv sync --all-groups

Run tests

uv run pytest

Code formatting

uv run black .

Complete Examples

Example 1: Simple File Processing

from odfdecrypt import decrypt, is_encrypted

# Process a single file
input_file = "report.odt"
password = "secret123"

if is_encrypted(input_file):
    decrypted_data = decrypt(input_file, password)
    with open("report_decrypted.odt", "wb") as f:
        f.write(decrypted_data.read())
    print("File decrypted successfully")
else:
    print("File is not encrypted")

Example 2: Batch Processing

import os
from odfdecrypt import decrypt, is_encrypted, IncorrectPasswordError

def decrypt_folder(folder_path, password, output_folder="decrypted"):
    """Decrypt all ODF files in a folder."""
    os.makedirs(output_folder, exist_ok=True)

    for filename in os.listdir(folder_path):
        if filename.endswith(('.odt', '.ods', '.odp', '.odg')):
            input_path = os.path.join(folder_path, filename)
            output_path = os.path.join(output_folder, filename)

            if is_encrypted(input_path):
                try:
                    decrypted_data = decrypt(input_path, password)
                    with open(output_path, "wb") as f:
                        f.write(decrypted_data.read())
                    print(f"✓ Decrypted: {filename}")
                except IncorrectPasswordError:
                    print(f"✗ Wrong password for: {filename}")
                except Exception as e:
                    print(f"✗ Failed to decrypt {filename}: {e}")
            else:
                # Copy unencrypted files
                with open(input_path, "rb") as src, open(output_path, "wb") as dst:
                    dst.write(src.read())
                print(f"- Copied (not encrypted): {filename}")

# Usage
decrypt_folder("documents/", "mypassword")

Example 3: Working with In-Memory Buffers

import zipfile
from odfdecrypt import decrypt

def extract_content_xml(file_path, password):
    """Extract and return the raw content.xml from an encrypted ODF file."""
    # Decrypt the file (returns a BytesIO buffer)
    decrypted_buffer = decrypt(file_path, password)

    # ODF files are ZIP archives - extract content.xml
    with zipfile.ZipFile(decrypted_buffer, "r") as zf:
        content_xml = zf.read("content.xml").decode("utf-8")
        return content_xml

# Usage
content = extract_content_xml("document.odt", "password")
print(f"Content XML length: {len(content)} characters")

Example 4: Advanced Origin Detection

from odfdecrypt import detect_origin, OpenOfficeOrigin, decrypt

def smart_decrypt(file_path, password):
    """Decrypt with detailed origin information."""
    origin = detect_origin(file_path)

    origin_info = {
        OpenOfficeOrigin.LIBREOFFICE: "LibreOffice (supports modern AES-256-GCM)",
        OpenOfficeOrigin.APACHE_OPEN_OFFICE: "Apache OpenOffice (supports Blowfish-CFB)",
        OpenOfficeOrigin.UNKNOWN: "Unknown origin (will attempt both methods)"
    }

    print(f"Detected origin: {origin_info.get(origin, 'Unknown')}")

    try:
        decrypted_data = decrypt(file_path, password)
        print("✓ Decryption successful")
        return decrypted_data
    except Exception as e:
        print(f"✗ Decryption failed: {e}")
        raise

# Usage
try:
    decrypted = smart_decrypt("mystery_document.odt", "password")
    with open("output.odt", "wb") as f:
        f.write(decrypted.read())
except Exception as e:
    print(f"Could not process document: {e}")

Command-Line Interface

A CLI tool is included for quick decryption tasks.

Basic Usage

odfdecrypt --file document.odt --password mypassword

This creates document_decrypted.odt in the same directory.

Options

odfdecrypt --file document.odt --password mypassword --output decrypted.odt

Short options are also available:

odfdecrypt -f document.odt -p mypassword -o decrypted.odt

CLI Features

  • Auto-detection: Automatically detects LibreOffice vs Apache OpenOffice files
  • Fallback support: Tries LibreOffice first, then Apache OpenOffice if needed
  • Non-encrypted files: Automatically copies files that aren't encrypted
  • Directory creation: Creates output directories if they don't exist
  • Error handling: Provides clear error messages for wrong passwords or corrupted files

Exit Codes

  • 0: Success (decryption or copy completed)
  • 1: Error (file not found, wrong password, decryption failed, etc.)
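
Because the exit code is the only status the CLI is documented to report, a calling script can branch on it. The sketch below wraps the documented `--file`, `--password`, and `--output` options with `subprocess`. The helper names `build_command` and `run_cli` are hypothetical, and the sketch assumes the `odfdecrypt` executable is on the PATH.

```python
import subprocess
from typing import List, Optional


def build_command(input_path: str, password: str,
                  output_path: Optional[str] = None) -> List[str]:
    """Build the argument list for the odfdecrypt CLI."""
    cmd = ["odfdecrypt", "--file", input_path, "--password", password]
    if output_path is not None:
        cmd += ["--output", output_path]
    return cmd


def run_cli(input_path: str, password: str,
            output_path: Optional[str] = None) -> bool:
    """Run the CLI and return True when it exits with code 0."""
    completed = subprocess.run(build_command(input_path, password, output_path))
    return completed.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems with passwords that contain spaces or special characters.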

License

Apache License 2.0. See LICENSE for details.
