
Standard-library toolkit for reading, editing, and writing OpenDocument Format files (ODT, ODP, ODS, ODG)


open-document-lib

A standard-library toolkit for reading, editing, and writing OpenDocument Format files — text documents (.odt), presentations (.odp), spreadsheets (.ods), and drawings (.odg) — plus their flat (single-XML) variants.

open-document-lib is the shared library behind the open-document-skills agent skills. The core has no dependencies beyond the Python standard library; a few helpers opt into lxml or bibtexparser when present.

Install

pip install open-document-lib

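# optional extras (quote the brackets so shells such as zsh do not glob them)
pip install "open-document-lib[validate]"    # lxml: RelaxNG schema validation
pip install "open-document-lib[scholarly]"   # bibtexparser: BibTeX citation ingest
pip install "open-document-lib[render]"      # Pillow: contact-sheet images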

Requires Python 3.10+. The package ships py.typed, so type checkers see its annotations.

Quick start

from pathlib import Path
from xml.etree import ElementTree as ET
from odf_lib import (
    parse_xml_from_zip, xml_bytes, write_odf_with_replacements,
    replace_text_in_element, update_meta_for_edit,
)

src = Path("report.odt")
content = parse_xml_from_zip(src, "content.xml")

# Structure-preserving find/replace across the document body.
text_ns = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
for para in content.iter(f"{{{text_ns}}}p"):
    replace_text_in_element(para, "{{CLIENT}}", "ACME GmbH")

# Stamping the edit into meta.xml (modification date, generator, cycle count)
# is done with update_meta_for_edit(meta, ns, q_fn). It needs a namespace map
# and a qualified-name helper, which the skills' *_common.py wrappers supply,
# so the call itself is omitted here.
meta = parse_xml_from_zip(src, "meta.xml")

write_odf_with_replacements(
    src, Path("report-out.odt"),
    {"content.xml": xml_bytes(content)},
    "application/vnd.oasis.opendocument.text",
)

Flat-ODF round-trip:

from odf_lib import pack_flat_odf, unpack_flat_odf

pack_flat_odf(Path("deck.odp"), Path("deck.fodp"))    # ZIP  → single XML
unpack_flat_odf(Path("deck.fodp"), Path("deck.odp"))  # XML  → ZIP package

API reference

Everything below is exported directly from the odf_lib package and is covered by semantic versioning from 1.0 onward. Anything in odf_lib.odf_common that is not listed here (notably _-prefixed helpers) is internal and may change without notice.

Constants

| Name | Description |
| --- | --- |
| VERSION | Library version string (also odf_lib.__version__). |
| ODF_NAMESPACES | dict[str, str] of ODF namespace prefixes → URIs. |
| FLAT_EXTENSIONS | Mapping of ODF mimetype → flat-file extension (.fodt, …). |
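
Several signatures in this reference take a namespace map (ns) and a qualified-name helper (q_fn). As a rough illustration of what such a pair might look like, here is a standard-library sketch. NS is a hand-written two-entry subset and q is a hypothetical helper; the behaviour of turning "prefix:local" into Clark notation is an assumption, not the library's confirmed contract.

```python
from xml.etree import ElementTree as ET

# Hand-written subset for illustration; the library exports a fuller map
# as ODF_NAMESPACES.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def q(name: str, ns: dict = NS) -> str:
    """Turn 'prefix:local' into Clark notation '{uri}local' (hypothetical helper)."""
    prefix, local = name.split(":", 1)
    return f"{{{ns[prefix]}}}{local}"


# Build a tiny body and query it with the same prefix map.
body = ET.Element(q("office:text"))
ET.SubElement(body, q("text:p")).text = "Hello"
paragraphs = body.findall("text:p", NS)
```

ElementTree stores tags in Clark notation internally, so a helper like this keeps call sites readable while still producing the exact tag strings that find and iter compare against.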

ZIP / XML core

| Signature | Description |
| --- | --- |
| parse_xml_from_zip(path, member) -> ET.Element | Parse one XML member of an ODF ZIP. |
| xml_bytes(root) -> bytes | Serialize an element to UTF-8 bytes with XML declaration. |
| write_odf_with_replacements(input_path, output_path, replacements, mimetype_value) -> None | Copy an ODF package, swapping named members; mimetype stays first and stored. |
| pack_dir_as_odf(source_dir, output_path, mimetype_value) -> None | Repack an extracted directory into a valid ODF file. |
| copy_into_package(input_path, output_path, package_path, source, replacements, mimetype_value) -> None | Add a single file to a package plus member replacements. |
| copy_with_multiple_members(input_path, output_path, new_members, replacements, mimetype_value) -> None | Add several new members in one pass (e.g. Object N/ sub-packages). |
| unpack_to_temp(path) -> tempfile.TemporaryDirectory | Extract a package to a managed temp directory. |
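
The "mimetype stays first and stored" rule is what lets consumers identify an ODF package by reading the start of the ZIP. The sketch below shows that constraint using only the standard library zipfile module. write_minimal_package is a hypothetical name, and this is not the library's implementation.

```python
import io
import zipfile


def write_minimal_package(members: dict, mimetype: str) -> bytes:
    """Write a ZIP whose first entry is an uncompressed 'mimetype' member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # ODF packaging expects 'mimetype' to come first and to be stored
        # without compression.
        zf.writestr(
            zipfile.ZipInfo("mimetype"), mimetype, compress_type=zipfile.ZIP_STORED
        )
        # Remaining members can be compressed as usual.
        for name, data in members.items():
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


blob = write_minimal_package(
    {"content.xml": b"<x/>"}, "application/vnd.oasis.opendocument.text"
)
archive = zipfile.ZipFile(io.BytesIO(blob))
first = archive.infolist()[0]
```

A package built this way is not a complete document (it has no manifest), but it shows the ordering and compression choice that any rewrite of a package has to preserve.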

Manifest and media

| Signature | Description |
| --- | --- |
| ensure_manifest_entry(manifest_root, full_path, media_type, ns, q_fn) -> None | Add or update a manifest:file-entry. |
| media_type_for(path) -> str | MIME type from a file extension. |
| sniff_image_mime(path) -> str | MIME type from magic bytes, with extension fallback. |
| unique_picture_name(existing, image) -> str | Collision-free Pictures/ filename. |
| unique_object_name(existing) -> str | Next free Object N sub-package name. |
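
To make "magic bytes, with extension fallback" concrete, here is a small standard-library sketch of that idea. The signature table, the function name sniff_mime, and the final application/octet-stream default are all assumptions for illustration; the library's own sniffer may recognise different formats.

```python
import mimetypes
import tempfile
from pathlib import Path

# A few well-known file signatures; a real sniffer would likely cover more.
_MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime(path: Path) -> str:
    """Prefer the file's leading bytes; fall back to its extension."""
    with path.open("rb") as fh:
        head = fh.read(16)
    for magic, mime in _MAGIC:
        if head.startswith(magic):
            return mime
    guessed, _ = mimetypes.guess_type(path.name)
    return guessed or "application/octet-stream"


with tempfile.TemporaryDirectory() as tmp:
    # PNG content behind a misleading .jpg name: the signature wins.
    mislabeled = Path(tmp) / "logo.jpg"
    mislabeled.write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
    # No known signature: the extension decides.
    unknown = Path(tmp) / "chart.gif"
    unknown.write_bytes(b"no recognizable signature")
    by_magic = sniff_mime(mislabeled)
    by_extension = sniff_mime(unknown)
```

Checking content first matters for manifests, because a wrong media type on a Pictures/ entry can outlive a careless rename of the source file.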

Metadata

| Signature | Description |
| --- | --- |
| update_meta_for_edit(meta_root, ns, q_fn) -> None | Refresh meta:modification-date/generator and bump editing-cycles. |
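
As a rough picture of what stamping an edit involves, the following standard-library sketch updates a generator string, a timestamp, and the editing-cycle counter on a parsed meta.xml tree. stamp_edit and _child are hypothetical names, and the sketch writes the timestamp to dc:date, the element the ODF metadata vocabulary uses for the last-modified time; which elements the library itself touches is not shown here.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
META = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
DC = "http://purl.org/dc/elements/1.1/"


def _child(parent: ET.Element, tag: str) -> ET.Element:
    """Return the first child with this tag, creating it if missing."""
    node = parent.find(tag)
    return node if node is not None else ET.SubElement(parent, tag)


def stamp_edit(meta_root: ET.Element, generator: str) -> None:
    block = _child(meta_root, f"{{{OFFICE}}}meta")
    _child(block, f"{{{META}}}generator").text = generator
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    _child(block, f"{{{DC}}}date").text = now
    # Treat a missing or empty counter as zero before incrementing.
    cycles = _child(block, f"{{{META}}}editing-cycles")
    cycles.text = str(int((cycles.text or "").strip() or "0") + 1)


root = ET.fromstring(
    f'<office:document-meta xmlns:office="{OFFICE}" xmlns:meta="{META}">'
    "<office:meta><meta:editing-cycles>4</meta:editing-cycles></office:meta>"
    "</office:document-meta>"
)
stamp_edit(root, "example-tool/0.1")
block = root.find(f"{{{OFFICE}}}meta")
```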

Flat ODF

| Signature | Description |
| --- | --- |
| pack_flat_odf(input_zip, output_flat) -> None | Convert a zipped ODF to flat single-XML form (pictures and Object N/ sub-packages inlined). |
| unpack_flat_odf(input_flat, output_zip) -> None | Convert a flat ODF back to a zipped package and rebuild the manifest. |
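
A flat ODF file carries its document type on the root element rather than in a separate mimetype member. The sketch below reads that attribute with the standard library and maps it to a flat extension. FLAT_EXT is a hand-written stand-in for the library's FLAT_EXTENSIONS constant, and flat_mimetype is a hypothetical helper.

```python
from typing import Optional
from xml.etree import ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Hypothetical stand-in for the library's FLAT_EXTENSIONS constant.
FLAT_EXT = {
    "application/vnd.oasis.opendocument.text": ".fodt",
    "application/vnd.oasis.opendocument.spreadsheet": ".fods",
    "application/vnd.oasis.opendocument.presentation": ".fodp",
    "application/vnd.oasis.opendocument.graphics": ".fodg",
}


def flat_mimetype(xml: bytes) -> Optional[str]:
    """Return office:mimetype from a flat ODF root, or None if it is not one."""
    root = ET.fromstring(xml)
    if root.tag != f"{{{OFFICE}}}document":
        return None
    return root.get(f"{{{OFFICE}}}mimetype")


sample = (
    f'<office:document xmlns:office="{OFFICE}" '
    'office:mimetype="application/vnd.oasis.opendocument.presentation"/>'
).encode()
mimetype = flat_mimetype(sample)
extension = FLAT_EXT.get(mimetype)
```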

Text walker, locator, insertion

| Signature | Description |
| --- | --- |
| replace_text_in_element(element, old, new) -> int | Structure-preserving find/replace across text and child tails. |
| replace_pattern_with_element_in_element(element, pattern, factory) -> int | Replace regex matches with generated elements. |
| find_text_position_in_element(element, needle) -> tuple \| None | Locate needle, returning (node, "text"\|"tail", offset). |
| insert_after_text_in_element(element, anchor, new_element) -> bool | Splice an element in right after an anchor string. |
| insert_in_paragraph(paragraph, position, new_element) -> None | Insert at the start or end of a paragraph. |
| wrap_text_with_pair_in_element(element, start_anchor, end_anchor, start_element, end_element) -> bool | Wrap an intra-paragraph text range with a start/end pair. |
| wrap_text_across_elements(elements, start_anchor, end_anchor, start_element, end_element) -> bool | Same, spanning multiple paragraphs. |
| ensure_sequence_declarations(text_root, names, ns) -> None | Ensure text:sequence-decl entries exist. |
| clear_children(element) -> None | Remove all children of an element. |
| local_name(tag) -> str | Local name from a Clark-notation tag. |
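
In ElementTree, the text of a paragraph with inline markup is split between the element's own text and the tail of each child, which is why a plain string replace on one attribute misses matches. Here is a simplified standard-library sketch of a replace that walks both. replace_in_text_nodes is a hypothetical name, and unlike a full implementation this version only finds matches that sit entirely inside one text node.

```python
from xml.etree import ElementTree as ET


def replace_in_text_nodes(element: ET.Element, old: str, new: str) -> int:
    """Replace 'old' in every text and tail under element; return the hit count."""
    count = 0
    for node in element.iter():
        if node.text and old in node.text:
            count += node.text.count(old)
            node.text = node.text.replace(old, new)
        # The tail of the starting element belongs to its parent, so skip it.
        if node is not element and node.tail and old in node.tail:
            count += node.tail.count(old)
            node.tail = node.tail.replace(old, new)
    return count


para = ET.fromstring("<p>Dear {{CLIENT}}, <b>thanks</b> from {{CLIENT}}.</p>")
hits = replace_in_text_nodes(para, "{{CLIENT}}", "ACME GmbH")
result = ET.tostring(para, encoding="unicode")
```

The child element survives untouched, which is the point of working on text and tail strings instead of flattening the paragraph and rebuilding it.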

Styles and pictures

| Signature | Description |
| --- | --- |
| inject_styles_from_file(input_path, styles_path, output_path, mimetype_value) -> list[str] | Replace styles.xml; returns dangling style references. |
| embed_pictures(input_path, pictures, output_path, mimetype_value, ns, q_fn) -> None | Bulk-add images to Pictures/ and the manifest. |
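
A dangling style reference is a style name that content still uses after styles.xml has been swapped for a file that no longer defines it. The sketch below shows one way such a check could work with the standard library. dangling_style_refs is a hypothetical helper, and the rule it applies (any attribute whose local name is style-name must match a style:style name) is a simplifying assumption, not the library's documented logic.

```python
from xml.etree import ElementTree as ET

STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def dangling_style_refs(content_root: ET.Element, styles_root: ET.Element) -> list:
    """Return sorted style names used in content but defined in neither tree."""
    defined = set()
    for root in (content_root, styles_root):
        for el in root.iter(f"{{{STYLE}}}style"):
            name = el.get(f"{{{STYLE}}}name")
            if name:
                defined.add(name)
    used = set()
    for el in content_root.iter():
        for attr, value in el.attrib.items():
            # Matches text:style-name, table:style-name, draw:style-name, ...
            if attr.endswith("}style-name"):
                used.add(value)
    return sorted(used - defined)


content = ET.fromstring(
    f'<doc xmlns:text="{TEXT}">'
    '<text:p text:style-name="Heading_20_1"/>'
    '<text:p text:style-name="Gone"/>'
    "</doc>"
)
styles = ET.fromstring(
    f'<doc xmlns:style="{STYLE}"><style:style style:name="Heading_20_1"/></doc>'
)
missing = dangling_style_refs(content, styles)
```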

Schema validation

| Signature | Description |
| --- | --- |
| ensure_schema(name) -> Path | Download and cache an OASIS ODF 1.3 RelaxNG schema (content/manifest). |
| validate_against_schema(xml_bytes_input, schema_name) -> tuple[bool, list[str]] | Validate XML bytes against a cached schema (requires lxml). |

External tooling

| Signature | Description |
| --- | --- |
| find_soffice() -> str | Locate the LibreOffice soffice binary; raises if absent. |
| find_pandoc() -> str \| None | Locate the pandoc binary. |
| latex_to_mathml(latex) -> bytes | Convert a LaTeX snippet to MathML via Pandoc. |
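
The two locator functions differ in how they report a missing tool: one raises, the other returns None. A minimal sketch of both styles, built on shutil.which from the standard library, follows. locate and require are hypothetical names, the candidate binary names are guesses, and the choice of FileNotFoundError is an assumption, since the exception type the library raises is not stated here.

```python
import shutil
from typing import Optional


def locate(*candidates: str) -> Optional[str]:
    """Return the first candidate found on PATH, or None."""
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


def require(*candidates: str) -> str:
    """Like locate(), but raise when nothing is found (assumed exception type)."""
    found = locate(*candidates)
    if found is None:
        raise FileNotFoundError(f"none of {candidates!r} found on PATH")
    return found


# Optional tool: callers check for None and degrade gracefully.
pandoc = locate("pandoc")

# Required tool: callers get an exception with the names that were tried.
try:
    soffice = require("soffice", "libreoffice")
except FileNotFoundError:
    soffice = None
```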

Rendering

| Signature | Description |
| --- | --- |
| render_to_pdf(odf_path, outdir) -> Path | Render an ODF file to PDF via LibreOffice (isolated profile). |
| pdf_to_pngs(pdf_path, outdir, dpi=150) -> list[Path] | Render each PDF page to a PNG via pdftoppm (Poppler). |
| build_contact_sheet(images, output_path, columns=0) -> Path | Compose page thumbnails into one labelled grid image. Requires Pillow (pip install open-document-lib[render]). |
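
For orientation, these are the kinds of command lines that a LibreOffice-plus-Poppler rendering pipeline typically runs. The sketch only builds the argument lists and does not execute anything. The function names, the throwaway profile directory, and the page output prefix are assumptions; the exact flags the library passes are not documented here.

```python
from pathlib import Path


def soffice_pdf_command(
    soffice: str, odf_path: Path, outdir: Path, profile_dir: Path
) -> list:
    """Argument list for a headless PDF export with a separate user profile."""
    return [
        soffice,
        # Pointing UserInstallation at a scratch directory keeps the run
        # isolated from the user's normal LibreOffice profile.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(odf_path),
    ]


def pdftoppm_command(pdf_path: Path, outdir: Path, dpi: int = 150) -> list:
    """Argument list that writes one PNG per page using the prefix 'page'."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(outdir / "page")]


pdf_cmd = soffice_pdf_command(
    "soffice", Path("report.odt"), Path("out"), Path("scratch-profile")
)
png_cmd = pdftoppm_command(Path("out") / "report.pdf", Path("out"))
```

Either list can be handed to subprocess.run(cmd, check=True) once the binaries are known to be installed.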

License

MIT. See LICENSE.
