Standard-library toolkit for reading, editing, and writing OpenDocument Format files (ODT, ODP, ODS, ODG)

open-document-lib

A standard-library toolkit for reading, editing, and writing OpenDocument Format files — text documents (.odt), presentations (.odp), spreadsheets (.ods), and drawings (.odg) — plus their flat (single-XML) variants.

open-document-lib is the shared library behind the open-document-skills agent skills. The core depends only on the Python standard library; a few helpers use lxml, bibtexparser, or Pillow when they are installed.

Install

pip install open-document-lib

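# optional extras (quoted so shells such as zsh do not expand the brackets)
pip install "open-document-lib[validate]"    # lxml: RelaxNG schema validation
pip install "open-document-lib[scholarly]"   # bibtexparser: BibTeX citation ingest
pip install "open-document-lib[render]"      # Pillow: contact-sheet images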

Requires Python 3.10+. The package ships py.typed, so type checkers see its annotations.

Quick start

from pathlib import Path
from xml.etree import ElementTree as ET
from odf_lib import (
    parse_xml_from_zip, xml_bytes, write_odf_with_replacements,
    replace_text_in_element, update_meta_for_edit,
)

src = Path("report.odt")
content = parse_xml_from_zip(src, "content.xml")

# Structure-preserving find/replace across the document body.
text_ns = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
for para in content.iter(f"{{{text_ns}}}p"):
    replace_text_in_element(para, "{{CLIENT}}", "ACME GmbH")

# To stamp the edit into meta.xml (modification date, generator, cycle count),
# pass this root to update_meta_for_edit(meta, ns, q_fn). The call is omitted
# here because it needs a namespace map and a qualified-name helper; the
# skills' *_common.py wrappers supply these.
meta = parse_xml_from_zip(src, "meta.xml")

write_odf_with_replacements(
    src, Path("report-out.odt"),
    {"content.xml": xml_bytes(content)},
    "application/vnd.oasis.opendocument.text",
)

Flat-ODF round-trip:

from odf_lib import pack_flat_odf, unpack_flat_odf

pack_flat_odf(Path("deck.odp"), Path("deck.fodp"))    # ZIP  → single XML
unpack_flat_odf(Path("deck.fodp"), Path("deck.odp"))  # XML  → ZIP package

API reference

Everything below is exported directly from the odf_lib package and is covered by semantic versioning from 1.0 onward. Anything in odf_lib.odf_common that is not listed here (notably _-prefixed helpers) is internal and may change without notice.

Constants

| Name | Description |
| --- | --- |
| `VERSION` | Library version string (also `odf_lib.__version__`). |
| `ODF_NAMESPACES` | `dict[str, str]` of ODF namespace prefixes → URIs. |
| `FLAT_EXTENSIONS` | Mapping of ODF mimetype → flat-file extension (`.fodt`, …). |

ZIP / XML core

| Signature | Description |
| --- | --- |
| `parse_xml_from_zip(path, member) -> ET.Element` | Parse one XML member of an ODF ZIP. |
| `xml_bytes(root) -> bytes` | Serialize an element to UTF-8 bytes with XML declaration. |
| `write_odf_with_replacements(input_path, output_path, replacements, mimetype_value) -> None` | Copy an ODF package, swapping named members; `mimetype` stays first and stored. |
| `pack_dir_as_odf(source_dir, output_path, mimetype_value) -> None` | Repack an extracted directory into a valid ODF file. |
| `copy_into_package(input_path, output_path, package_path, source, replacements, mimetype_value) -> None` | Add a single file to a package plus member replacements. |
| `copy_with_multiple_members(input_path, output_path, new_members, replacements, mimetype_value) -> None` | Add several new members in one pass (e.g. `Object N/` sub-packages). |
| `unpack_to_temp(path) -> tempfile.TemporaryDirectory` | Extract a package to a managed temp directory. |
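The "mimetype stays first and stored" rule can be illustrated with the standard library alone. The following is a minimal sketch, not the library's implementation: `write_minimal_package` is a hypothetical helper, and it assumes the usual ODF packaging convention that the `mimetype` member is the first ZIP entry and is written uncompressed.

```python
import io
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def write_minimal_package(members: dict[str, bytes], mimetype: str) -> bytes:
    """Build an ODF-style ZIP in memory (hypothetical helper, not odf_lib API)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry is written first and without compression.
        package.writestr(
            zipfile.ZipInfo("mimetype"), mimetype, compress_type=zipfile.ZIP_STORED
        )
        for name, data in members.items():
            if name == "mimetype":
                continue  # already written above
            package.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


package_bytes = write_minimal_package({"content.xml": b"<doc/>"}, ODT_MIMETYPE)
with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
    first = package.infolist()[0]
    print(first.filename, first.compress_type == zipfile.ZIP_STORED)
```

Writing the entry order explicitly matters because generic ZIP tools usually compress every member and do not guarantee which one comes first.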

Manifest and media

| Signature | Description |
| --- | --- |
| `ensure_manifest_entry(manifest_root, full_path, media_type, ns, q_fn) -> None` | Add or update a `manifest:file-entry`. |
| `media_type_for(path) -> str` | MIME type from a file extension. |
| `sniff_image_mime(path) -> str` | MIME type from magic bytes, with extension fallback. |
| `unique_picture_name(existing, image) -> str` | Collision-free `Pictures/` filename. |
| `unique_object_name(existing) -> str` | Next free `Object N` sub-package name. |
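To make "magic bytes, with extension fallback" concrete, here is a small standard-library sketch of the general technique. It is an illustration only: `guess_image_mime` and both lookup tables are hypothetical, and the set of formats the library actually recognises may differ.

```python
from pathlib import Path

# Leading byte signatures for a few common image formats.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

# Fallback used only when no signature matches.
_BY_EXTENSION = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".svg": "image/svg+xml",
}


def guess_image_mime(header: bytes, filename: str) -> str:
    """Prefer the file's leading bytes; fall back to its extension."""
    for signature, mime in _SIGNATURES:
        if header.startswith(signature):
            return mime
    suffix = Path(filename).suffix.lower()
    return _BY_EXTENSION.get(suffix, "application/octet-stream")


# A PNG payload saved under a misleading .jpg name is still detected as PNG.
print(guess_image_mime(b"\x89PNG\r\n\x1a\n\x00\x00", "scan.jpg"))
```

Checking content before the extension keeps the manifest's media type correct even when a file has been renamed.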

Metadata

| Signature | Description |
| --- | --- |
| `update_meta_for_edit(meta_root, ns, q_fn) -> None` | Refresh `meta:modification-date`/`generator` and bump `editing-cycles`. |
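Several functions take a namespace map (`ns`) and a qualified-name helper (`q_fn`). The sketch below shows one plausible shape for that pair and uses it to bump `meta:editing-cycles`. The helper `q`, the `NS` map, and `bump_editing_cycles` are assumptions made for illustration; the exact signature the library expects for `q_fn` is not documented here.

```python
from xml.etree import ElementTree as ET

# A small namespace map; the library's ODF_NAMESPACES is assumed to be similar.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}


def q(prefixed: str, ns: dict[str, str] = NS) -> str:
    """Turn 'meta:generator' into Clark notation '{uri}generator' (hypothetical helper)."""
    prefix, local = prefixed.split(":", 1)
    return f"{{{ns[prefix]}}}{local}"


def bump_editing_cycles(meta_root: ET.Element) -> int:
    """Increment meta:editing-cycles under office:meta, creating it if missing."""
    office_meta = meta_root.find(q("office:meta"))
    if office_meta is None:
        office_meta = ET.SubElement(meta_root, q("office:meta"))
    cycles = office_meta.find(q("meta:editing-cycles"))
    if cycles is None:
        cycles = ET.SubElement(office_meta, q("meta:editing-cycles"))
        cycles.text = "0"
    cycles.text = str(int(cycles.text or "0") + 1)
    return int(cycles.text)


sample = ET.fromstring(
    f'<office:document-meta xmlns:office="{NS["office"]}" xmlns:meta="{NS["meta"]}">'
    "<office:meta><meta:editing-cycles>3</meta:editing-cycles></office:meta>"
    "</office:document-meta>"
)
print(bump_editing_cycles(sample))
```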

Flat ODF

| Signature | Description |
| --- | --- |
| `pack_flat_odf(input_zip, output_flat) -> None` | Convert a zipped ODF to flat single-XML form (pictures and `Object N/` sub-packages inlined). |
| `unpack_flat_odf(input_flat, output_zip) -> None` | Convert a flat ODF back to a zipped package and rebuild the manifest. |
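Choosing the right flat extension depends on the package's mimetype. As a rough sketch of that lookup (not the library's code), the snippet below reads the `mimetype` member of a zipped package and maps it to a flat extension. `FLAT_EXTENSION_BY_MIMETYPE` and `flat_extension_for` are hypothetical names; the library's own `FLAT_EXTENSIONS` constant may cover more types.

```python
import io
import zipfile
from typing import Optional

# Zipped-package mimetype -> flat single-XML extension.
FLAT_EXTENSION_BY_MIMETYPE = {
    "application/vnd.oasis.opendocument.text": ".fodt",
    "application/vnd.oasis.opendocument.presentation": ".fodp",
    "application/vnd.oasis.opendocument.spreadsheet": ".fods",
    "application/vnd.oasis.opendocument.graphics": ".fodg",
}


def flat_extension_for(package: bytes) -> Optional[str]:
    """Return the flat extension for a zipped ODF package, or None if unknown."""
    stream = io.BytesIO(package)
    if not zipfile.is_zipfile(stream):
        return None  # flat ODF files are plain XML, not ZIP archives
    with zipfile.ZipFile(stream) as archive:
        if "mimetype" not in archive.namelist():
            return None
        mimetype = archive.read("mimetype").decode("ascii").strip()
    return FLAT_EXTENSION_BY_MIMETYPE.get(mimetype)


# Build a tiny presentation-like package in memory and look up its extension.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("mimetype", "application/vnd.oasis.opendocument.presentation")
print(flat_extension_for(demo.getvalue()))
```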

Text walker, locator, insertion

| Signature | Description |
| --- | --- |
| `replace_text_in_element(element, old, new) -> int` | Structure-preserving find/replace across text and child tails. |
| `replace_pattern_with_element_in_element(element, pattern, factory) -> int` | Replace regex matches with generated elements. |
| `find_text_position_in_element(element, needle) -> tuple \| None` | Locate `needle`, returning `(node, "text"\|"tail", offset)`. |
| `insert_after_text_in_element(element, anchor, new_element) -> bool` | Splice an element in right after an anchor string. |
| `insert_in_paragraph(paragraph, position, new_element) -> None` | Insert at the start or end of a paragraph. |
| `wrap_text_with_pair_in_element(element, start_anchor, end_anchor, start_element, end_element) -> bool` | Wrap an intra-paragraph text range with a start/end pair. |
| `wrap_text_across_elements(elements, start_anchor, end_anchor, start_element, end_element) -> bool` | Same, spanning multiple paragraphs. |
| `ensure_sequence_declarations(text_root, names, ns) -> None` | Ensure `text:sequence-decl` entries exist. |
| `clear_children(element) -> None` | Remove all children of an element. |
| `local_name(tag) -> str` | Local name from a Clark-notation tag. |
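In ElementTree, the text of a paragraph is split between each element's `.text` and the `.tail` that follows each child, which is why a structure-preserving replace has to visit both. The sketch below illustrates that walk with hypothetical functions; it is a simplified stand-in, not the library's algorithm, and it only handles matches that fall inside a single text or tail string.

```python
from xml.etree import ElementTree as ET


def strip_namespace(tag: str) -> str:
    """Return the local part of a Clark-notation tag such as '{uri}p'."""
    return tag.rsplit("}", 1)[-1]


def replace_in_text_and_tails(element: ET.Element, old: str, new: str) -> int:
    """Replace `old` in element text and descendant tails; return the match count."""
    if not old:
        raise ValueError("old must be a non-empty string")
    count = 0
    if element.text and old in element.text:
        count += element.text.count(old)
        element.text = element.text.replace(old, new)
    for child in element:
        # Recurse into the child first, then handle the text that follows it.
        count += replace_in_text_and_tails(child, old, new)
        if child.tail and old in child.tail:
            count += child.tail.count(old)
            child.tail = child.tail.replace(old, new)
    return count


para = ET.fromstring("<p>Hello <span>{{CLIENT}}</span>, welcome {{CLIENT}}.</p>")
replaced = replace_in_text_and_tails(para, "{{CLIENT}}", "ACME GmbH")
print(replaced, ET.tostring(para, encoding="unicode"))
```

Because only string values are rewritten, the `span` child and any attributes on it survive the edit untouched.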

Styles and pictures

| Signature | Description |
| --- | --- |
| `inject_styles_from_file(input_path, styles_path, output_path, mimetype_value) -> list[str]` | Replace `styles.xml`; returns dangling style references. |
| `embed_pictures(input_path, pictures, output_path, mimetype_value, ns, q_fn) -> None` | Bulk-add images to `Pictures/` and the manifest. |
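A "dangling style reference" is a style name used in the content that no longer has a definition after `styles.xml` is swapped. One way such a check could work is sketched below. This is an assumption about the general idea, not the library's logic: `dangling_style_names` is hypothetical and only looks at attributes whose local name is exactly `style-name`.

```python
from xml.etree import ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def dangling_style_names(content_root: ET.Element, styles_root: ET.Element) -> list[str]:
    """List style names referenced in content but defined in neither document."""
    defined = set()
    for root in (content_root, styles_root):
        for node in root.iter(f"{{{STYLE_NS}}}style"):
            name = node.get(f"{{{STYLE_NS}}}name")
            if name:
                defined.add(name)
    referenced = set()
    for node in content_root.iter():
        for attr, value in node.attrib.items():
            # Matches text:style-name, table:style-name, draw:style-name, ...
            if attr.endswith("}style-name"):
                referenced.add(value)
    return sorted(referenced - defined)


content = ET.fromstring(
    f'<doc xmlns:text="{TEXT_NS}" xmlns:style="{STYLE_NS}">'
    '<style:style style:name="P1"/>'
    '<text:p text:style-name="P1"/>'
    '<text:p text:style-name="Quote"/>'
    '<text:p text:style-name="Heading"/>'
    "</doc>"
)
styles = ET.fromstring(
    f'<doc xmlns:style="{STYLE_NS}"><style:style style:name="Heading"/></doc>'
)
print(dangling_style_names(content, styles))
```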

Schema validation

| Signature | Description |
| --- | --- |
| `ensure_schema(name) -> Path` | Download and cache an OASIS ODF 1.3 RelaxNG schema (content/manifest). |
| `validate_against_schema(xml_bytes_input, schema_name) -> tuple[bool, list[str]]` | Validate XML bytes against a cached schema (requires lxml). |

External tooling

| Signature | Description |
| --- | --- |
| `find_soffice() -> str` | Locate the LibreOffice `soffice` binary; raises if absent. |
| `find_pandoc() -> str \| None` | Locate the `pandoc` binary. |
| `latex_to_mathml(latex) -> bytes` | Convert a LaTeX snippet to MathML via Pandoc. |
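The two lookup functions differ in how they report a missing tool: one raises, the other returns `None`. The sketch below shows both styles built on `shutil.which`. The function names and candidate lists are hypothetical, and the library may also search platform-specific install locations that this sketch ignores.

```python
import shutil
from typing import Optional, Sequence


def find_binary(candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate found on PATH, or None (optional-tool style)."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def require_binary(candidates: Sequence[str], label: str) -> str:
    """Like find_binary, but raise when nothing is found (required-tool style)."""
    path = find_binary(candidates)
    if path is None:
        tried = ", ".join(candidates)
        raise FileNotFoundError(f"{label} not found on PATH (tried: {tried})")
    return path


# An optional tool can be probed without exception handling.
pandoc_path = find_binary(["pandoc"])
print("pandoc available:", pandoc_path is not None)
```

Returning `None` suits features that can be skipped quietly, while raising suits steps such as rendering that cannot proceed without the tool.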

Rendering

| Signature | Description |
| --- | --- |
| `render_to_pdf(odf_path, outdir) -> Path` | Render an ODF file to PDF via LibreOffice (isolated profile). |
| `pdf_to_pngs(pdf_path, outdir, dpi=150) -> list[Path]` | Render each PDF page to a PNG via `pdftoppm` (Poppler). |
| `build_contact_sheet(images, output_path, columns=0) -> Path` | Compose page thumbnails into one labelled grid image. Requires Pillow (`pip install open-document-lib[render]`). |
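For orientation, the command lines behind a render pipeline like this typically look as follows. This sketch only builds argument lists and runs nothing. The helper names are hypothetical, and whether the library passes exactly these flags is an assumption; the flags themselves are standard LibreOffice and Poppler options.

```python
from pathlib import Path


def soffice_pdf_command(
    soffice: str, odf_path: Path, outdir: Path, profile_dir: Path
) -> list[str]:
    """Arguments for a headless LibreOffice PDF conversion with its own profile."""
    return [
        soffice,
        # A separate user profile avoids clashing with a running LibreOffice.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(odf_path),
    ]


def pdftoppm_command(pdf_path: Path, prefix: Path, dpi: int = 150) -> list[str]:
    """Arguments for rendering every PDF page to '<prefix>-N.png' at the given DPI."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(prefix)]


command = soffice_pdf_command(
    "soffice", Path("report.odt"), Path("out"), Path("lo-profile")
)
print(command)
print(pdftoppm_command(Path("out/report.pdf"), Path("out/page")))
```

Each list can be handed to `subprocess.run(command, check=True)` once the binaries have been located.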

License

MIT. See LICENSE.
