Standard-library toolkit for reading, editing, and writing OpenDocument Format files (ODT, ODP, ODS, ODG)

open-document-lib

A standard-library toolkit for reading, editing, and writing OpenDocument Format files — text documents (.odt), presentations (.odp), spreadsheets (.ods), and drawings (.odg) — plus their flat (single-XML) variants.

open-document-lib is the shared library behind the open-document-skills agent skills. The core has no dependencies beyond the Python standard library; a few helpers opt into lxml or bibtexparser when present.

Install

pip install open-document-lib

# optional extras (quoted so shells such as zsh do not expand the brackets)
pip install "open-document-lib[validate]"    # lxml: RelaxNG schema validation
pip install "open-document-lib[scholarly]"   # bibtexparser: BibTeX citation ingest

Requires Python 3.10+. The package ships py.typed, so type checkers see its annotations.
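
To see which optional extras are usable in a given environment, a small standard-library check is enough. This is a sketch only: the names OPTIONAL_MODULES and available_extras are made up for illustration and are not part of odf_lib.

```python
import importlib.util

# Hypothetical mapping: extra name -> module that the extra installs.
OPTIONAL_MODULES = {"validate": "lxml", "scholarly": "bibtexparser"}


def available_extras():
    """Report which optional modules can be imported in this environment."""
    return {
        extra: importlib.util.find_spec(module) is not None
        for extra, module in OPTIONAL_MODULES.items()
    }


print(available_extras())
```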

Quick start

from pathlib import Path
from xml.etree import ElementTree as ET
from odf_lib import (
    parse_xml_from_zip, xml_bytes, write_odf_with_replacements,
    replace_text_in_element, update_meta_for_edit,
)

src = Path("report.odt")
content = parse_xml_from_zip(src, "content.xml")

# Structure-preserving find/replace across the document body.
text_ns = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
for para in content.iter(f"{{{text_ns}}}p"):
    replace_text_in_element(para, "{{CLIENT}}", "ACME GmbH")

# Stamping the edit into meta.xml (modification date, generator, editing cycles)
# goes through update_meta_for_edit, which also needs a namespace map and a
# qualified-name helper. The skills' *_common.py wrappers supply both, so the
# call itself is omitted here.
meta = parse_xml_from_zip(src, "meta.xml")

write_odf_with_replacements(
    src, Path("report-out.odt"),
    {"content.xml": xml_bytes(content)},
    "application/vnd.oasis.opendocument.text",
)

Flat-ODF round-trip:

from odf_lib import pack_flat_odf, unpack_flat_odf

pack_flat_odf(Path("deck.odp"), Path("deck.fodp"))    # ZIP  → single XML
unpack_flat_odf(Path("deck.fodp"), Path("deck.odp"))  # XML  → ZIP package

API reference

Everything below is exported directly from the odf_lib package and is covered by semantic versioning from 1.0 onward. Anything in odf_lib.odf_common that is not listed here (notably _-prefixed helpers) is internal and may change without notice.

Constants

Name
    Description
VERSION
    Library version string (also odf_lib.__version__).
ODF_NAMESPACES
    dict[str, str] of ODF namespace prefixes → URIs.
FLAT_EXTENSIONS
    Mapping of ODF mimetype → flat-file extension (.fodt, …).
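
As an illustration of how a mimetype → flat-extension mapping can be used, the sketch below builds its own small table from the standard ODF mimetypes. FLAT_EXTENSION_SKETCH and flat_name_for are hypothetical names, and the table is not a copy of the library's FLAT_EXTENSIONS, whose exact contents are not listed here.

```python
from pathlib import PurePosixPath

# Illustrative table only (assumption): standard ODF mimetypes and the
# flat-file extensions conventionally paired with them.
FLAT_EXTENSION_SKETCH = {
    "application/vnd.oasis.opendocument.text": ".fodt",
    "application/vnd.oasis.opendocument.spreadsheet": ".fods",
    "application/vnd.oasis.opendocument.presentation": ".fodp",
    "application/vnd.oasis.opendocument.graphics": ".fodg",
}


def flat_name_for(filename, mimetype):
    """Swap the file suffix for the flat extension matching the mimetype."""
    return str(PurePosixPath(filename).with_suffix(FLAT_EXTENSION_SKETCH[mimetype]))


print(flat_name_for("deck.odp", "application/vnd.oasis.opendocument.presentation"))
```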

ZIP / XML core

Signature
    Description
parse_xml_from_zip(path, member) -> ET.Element
    Parse one XML member of an ODF ZIP.
xml_bytes(root) -> bytes
    Serialize an element to UTF-8 bytes with XML declaration.
write_odf_with_replacements(input_path, output_path, replacements, mimetype_value) -> None
    Copy an ODF package, swapping named members; mimetype stays first and stored.
pack_dir_as_odf(source_dir, output_path, mimetype_value) -> None
    Repack an extracted directory into a valid ODF file.
copy_into_package(input_path, output_path, package_path, source, replacements, mimetype_value) -> None
    Add a single file to a package plus member replacements.
copy_with_multiple_members(input_path, output_path, new_members, replacements, mimetype_value) -> None
    Add several new members in one pass (e.g. Object N/ sub-packages).
unpack_to_temp(path) -> tempfile.TemporaryDirectory
    Extract a package to a managed temp directory.
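
The "mimetype stays first and stored" rule comes from the ODF packaging convention: the mimetype entry is written before every other member and without compression. The sketch below shows that rule with the standard zipfile module only. write_minimal_package is a hypothetical helper, not the library's implementation.

```python
import io
import zipfile


def write_minimal_package(members, mimetype):
    """Build an in-memory ODF-style ZIP from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry is written first and left uncompressed.
        zf.writestr("mimetype", mimetype.encode("ascii"),
                    compress_type=zipfile.ZIP_STORED)
        # Every other member follows, using the archive's default compression.
        for name, data in members.items():
            if name != "mimetype":
                zf.writestr(name, data)
    return buffer.getvalue()


package = write_minimal_package(
    {"content.xml": b"<x/>"}, "application/vnd.oasis.opendocument.text"
)
with zipfile.ZipFile(io.BytesIO(package)) as zf:
    print([info.filename for info in zf.infolist()])
```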

Manifest and media

Signature
    Description
ensure_manifest_entry(manifest_root, full_path, media_type, ns, q_fn) -> None
    Add or update a manifest:file-entry.
media_type_for(path) -> str
    MIME type from a file extension.
sniff_image_mime(path) -> str
    MIME type from magic bytes, with extension fallback.
unique_picture_name(existing, image) -> str
    Collision-free Pictures/ filename.
unique_object_name(existing) -> str
    Next free Object N sub-package name.
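
Sniffing a MIME type "from magic bytes, with extension fallback" can be pictured as follows. This is a sketch under assumptions: it takes raw bytes plus a filename instead of a path, recognises only three formats, and sniff_image_mime_sketch is a made-up name. Which formats the library actually recognises is not stated here.

```python
import mimetypes

# Leading byte signatures for a few common image formats.
_MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_image_mime_sketch(data, filename):
    """Prefer the byte signature; fall back to the file extension."""
    for magic, mime in _MAGIC:
        if data.startswith(magic):
            return mime
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


# A PNG signature wins even when the extension is misleading.
print(sniff_image_mime_sketch(b"\x89PNG\r\n\x1a\n", "picture.dat"))
```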

Metadata

Signature
    Description
update_meta_for_edit(meta_root, ns, q_fn) -> None
    Refresh meta:modification-date/generator and bump editing-cycles.
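
The ns and q_fn parameters are not described further here. One plausible reading, shown below as an assumption rather than the library's contract, is a prefix → URI map plus a helper that turns "prefix:name" into ElementTree's Clark notation. The sketch uses that pair to bump meta:editing-cycles and set meta:generator; NS, q, and stamp_edit are hypothetical names.

```python
from xml.etree import ElementTree as ET

# Assumed shape of the namespace map: prefix -> namespace URI.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}


def q(name):
    """Assumed qualified-name helper: 'meta:generator' -> '{uri}generator'."""
    prefix, local = name.split(":", 1)
    return f"{{{NS[prefix]}}}{local}"


def _child(parent, name):
    """Return the named child, creating it when missing."""
    node = parent.find(q(name))
    return node if node is not None else ET.SubElement(parent, q(name))


def stamp_edit(meta_root, generator):
    """Increment editing-cycles, set the generator, return the new count."""
    office_meta = _child(meta_root, "office:meta")
    cycles = _child(office_meta, "meta:editing-cycles")
    cycles.text = str(int((cycles.text or "0").strip()) + 1)
    _child(office_meta, "meta:generator").text = generator
    return int(cycles.text)
```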

Flat ODF

Signature
    Description
pack_flat_odf(input_zip, output_flat) -> None
    Convert a zipped ODF to flat single-XML form (pictures and Object N/ sub-packages inlined).
unpack_flat_odf(input_flat, output_zip) -> None
    Convert a flat ODF back to a zipped package and rebuild the manifest.
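
A flat ODF file is a single XML document whose root is office:document and which carries its type in the office:mimetype attribute. The sketch below reads that attribute with the standard library. flat_odf_mimetype is a hypothetical helper for illustration, not an odf_lib function.

```python
from xml.etree import ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def flat_odf_mimetype(xml_data):
    """Return the office:mimetype of a flat ODF document, or None."""
    root = ET.fromstring(xml_data)
    if root.tag != f"{{{OFFICE_NS}}}document":
        return None  # not a flat ODF root element
    return root.get(f"{{{OFFICE_NS}}}mimetype")


sample = (
    b'<office:document '
    b'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    b'office:mimetype="application/vnd.oasis.opendocument.presentation"/>'
)
print(flat_odf_mimetype(sample))
```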

Text walker, locator, insertion

Signature
    Description
replace_text_in_element(element, old, new) -> int
    Structure-preserving find/replace across text and child tails.
replace_pattern_with_element_in_element(element, pattern, factory) -> int
    Replace regex matches with generated elements.
find_text_position_in_element(element, needle) -> tuple | None
    Locate needle, returning (node, "text"|"tail", offset).
insert_after_text_in_element(element, anchor, new_element) -> bool
    Splice an element in right after an anchor string.
insert_in_paragraph(paragraph, position, new_element) -> None
    Insert at the start or end of a paragraph.
wrap_text_with_pair_in_element(element, start_anchor, end_anchor, start_element, end_element) -> bool
    Wrap an intra-paragraph text range with a start/end pair.
wrap_text_across_elements(elements, start_anchor, end_anchor, start_element, end_element) -> bool
    Same, spanning multiple paragraphs.
ensure_sequence_declarations(text_root, names, ns) -> None
    Ensure text:sequence-decl entries exist.
clear_children(element) -> None
    Remove all children of an element.
local_name(tag) -> str
    Local name from a Clark-notation tag.
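
In ElementTree, the text of a paragraph is spread over the element's own text and the tail of each descendant, which is why find/replace has to visit both. The sketch below is a simplified stand-in written for illustration: it only handles matches that sit entirely inside one text or tail string, and whether the library also handles matches that straddle child elements is not stated here. replace_in_text_nodes is a hypothetical name.

```python
from xml.etree import ElementTree as ET


def replace_in_text_nodes(element, old, new):
    """Replace old with new in every text and tail; return the match count."""
    count = 0
    for node in element.iter():
        if node.text and old in node.text:
            count += node.text.count(old)
            node.text = node.text.replace(old, new)
        # The tail of the top element belongs to its parent, so skip it.
        if node is not element and node.tail and old in node.tail:
            count += node.tail.count(old)
            node.tail = node.tail.replace(old, new)
    return count


para = ET.fromstring(
    "<p>Dear {{CLIENT}}, <span>bold {{CLIENT}}</span> and {{CLIENT}} again</p>"
)
replaced = replace_in_text_nodes(para, "{{CLIENT}}", "ACME")
print(replaced, ET.tostring(para, encoding="unicode"))
```

The child elements are left in place, so any styling spans around the replaced text survive the edit.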

Styles and pictures

Signature
    Description
inject_styles_from_file(input_path, styles_path, output_path, mimetype_value) -> list[str]
    Replace styles.xml; returns dangling style references.
embed_pictures(input_path, pictures, output_path, mimetype_value, ns, q_fn) -> None
    Bulk-add images to Pictures/ and the manifest.
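
One way to think about dangling style references after swapping styles.xml: style names that the content still points to but that no style:style element defines any more. The sketch below implements that reading with ElementTree. It is an assumption about what "dangling" means, and dangling_style_names is a hypothetical helper; the library's exact rules are not documented here.

```python
from xml.etree import ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def dangling_style_names(content_root, styles_root):
    """List style names referenced in content but defined in neither tree."""
    defined = set()
    for root in (content_root, styles_root):
        for node in root.iter(f"{{{STYLE_NS}}}style"):
            name = node.get(f"{{{STYLE_NS}}}name")
            if name:
                defined.add(name)
    referenced = set()
    for node in content_root.iter():
        for attr, value in node.attrib.items():
            # Matches text:style-name, draw:style-name, table:style-name, ...
            if attr.rsplit("}", 1)[-1] == "style-name":
                referenced.add(value)
    return sorted(referenced - defined)
```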

Schema validation

Signature
    Description
ensure_schema(name) -> Path
    Download and cache an OASIS ODF 1.3 RelaxNG schema (content/manifest).
validate_against_schema(xml_bytes_input, schema_name) -> tuple[bool, list[str]]
    Validate XML bytes against a cached schema (requires lxml).
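
For orientation, RelaxNG validation with lxml generally looks like the sketch below, which returns the same (ok, errors) shape as the signature above. It uses a tiny inline schema rather than the cached OASIS schemas, and validate_bytes is a hypothetical name. How the library reports a missing lxml is not stated here, so the fallback message is an assumption.

```python
import io

# Toy RelaxNG schema: the document must be a single empty <a/> element.
TOY_SCHEMA = (
    b'<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">'
    b"<empty/></element>"
)


def validate_bytes(xml_data, rng_data):
    """Validate XML bytes against RelaxNG bytes; return (ok, messages)."""
    try:
        from lxml import etree  # optional dependency
    except ImportError:
        return False, ["lxml is not installed"]
    schema = etree.RelaxNG(etree.parse(io.BytesIO(rng_data)))
    try:
        doc = etree.fromstring(xml_data)
    except etree.XMLSyntaxError as exc:
        return False, [str(exc)]
    ok = schema.validate(doc)
    return ok, [f"line {entry.line}: {entry.message}" for entry in schema.error_log]


print(validate_bytes(b"<a/>", TOY_SCHEMA))
```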

External tooling

Signature
    Description
find_soffice() -> str
    Locate the LibreOffice soffice binary; raises if absent.
find_pandoc() -> str | None
    Locate the pandoc binary.
latex_to_mathml(latex) -> bytes
    Convert a LaTeX snippet to MathML via Pandoc.
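
The two lookup styles above (return None versus raise when the tool is missing) can be sketched with shutil.which. This only searches PATH. Which locations the library checks and which exception it raises are not stated here, so find_first_binary, require_binary, and the FileNotFoundError are illustrative assumptions.

```python
import shutil


def find_first_binary(candidates):
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def require_binary(candidates):
    """Like find_first_binary, but raise when nothing is found."""
    path = find_first_binary(candidates)
    if path is None:
        raise FileNotFoundError(f"none of {list(candidates)} found on PATH")
    return path


print(find_first_binary(["pandoc"]))
```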

License

MIT. See LICENSE.
