Standard-library toolkit for reading, editing, and writing OpenDocument Format files (ODT, ODP, ODS, ODG)

open-document-lib

A standard-library toolkit for reading, editing, and writing OpenDocument Format files — text documents (.odt), presentations (.odp), spreadsheets (.ods), and drawings (.odg) — plus their flat (single-XML) variants.

open-document-lib is the shared library behind the open-document-skills agent skills. The core has no dependencies beyond the Python standard library; a few helpers use lxml, bibtexparser, or Pillow when they are installed.

Install

pip install open-document-lib

# optional extras
pip install open-document-lib[validate]    # lxml: RelaxNG schema validation
pip install open-document-lib[scholarly]   # bibtexparser: BibTeX citation ingest
pip install open-document-lib[render]      # Pillow: contact sheets (build_contact_sheet)

Requires Python 3.10+. The package ships py.typed, so type checkers see its annotations.

Quick start

from pathlib import Path
from xml.etree import ElementTree as ET
from odf_lib import (
    parse_xml_from_zip, xml_bytes, write_odf_with_replacements,
    replace_text_in_element, update_meta_for_edit,
)

src = Path("report.odt")
content = parse_xml_from_zip(src, "content.xml")

# Structure-preserving find/replace across the document body.
text_ns = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
for para in content.iter(f"{{{text_ns}}}p"):
    replace_text_in_element(para, "{{CLIENT}}", "ACME GmbH")

# Optionally stamp the edit into meta.xml (modification date, generator, cycle count).
# update_meta_for_edit(meta, ns, q_fn) needs a namespace map and a qualified-name
# helper; the skills' *_common.py wrappers supply these, so the call is left out here.
meta = parse_xml_from_zip(src, "meta.xml")

write_odf_with_replacements(
    src, Path("report-out.odt"),
    {"content.xml": xml_bytes(content)},
    "application/vnd.oasis.opendocument.text",
)

Flat-ODF round-trip:

from odf_lib import pack_flat_odf, unpack_flat_odf

pack_flat_odf(Path("deck.odp"), Path("deck.fodp"))    # ZIP  → single XML
unpack_flat_odf(Path("deck.fodp"), Path("deck.odp"))  # XML  → ZIP package

API reference

Everything below is exported directly from the odf_lib package and is covered by semantic versioning from 1.0 onward. Anything in odf_lib.odf_common that is not listed here (notably _-prefixed helpers) is internal and may change without notice.

Constants

| Name | Description |
| --- | --- |
| VERSION | Library version string (also odf_lib.__version__). |
| ODF_NAMESPACES | dict[str, str] of ODF namespace prefixes → URIs. |
| FLAT_EXTENSIONS | Mapping of ODF mimetype → flat-file extension (.fodt, …). |
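
Several functions below take a namespace map (ns) and a qualified-name helper (q_fn), whose exact shapes are not spelled out in this reference. As a rough sketch, assuming q_fn turns a prefixed name such as text:p into ElementTree's Clark notation, such a helper might look like the following. NS is a two-entry stand-in for ODF_NAMESPACES, and qname and local_part are hypothetical names, not part of the library.

```python
# Stand-in for ODF_NAMESPACES (assumption: prefix -> URI, as described above).
NS = {
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
}


def qname(prefixed: str, ns: dict = NS) -> str:
    """Turn 'text:p' into '{urn:...:text:1.0}p' (hypothetical helper)."""
    prefix, _, local = prefixed.partition(":")
    if not local:
        raise ValueError(f"expected 'prefix:name', got {prefixed!r}")
    return f"{{{ns[prefix]}}}{local}"


def local_part(tag: str) -> str:
    """Strip the '{uri}' part from a Clark-notation tag."""
    return tag.rsplit("}", 1)[-1]


paragraph_tag = qname("text:p")
```

With a helper like this, content.iter(qname("text:p")) would walk the same paragraphs as the Quick start loop.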

ZIP / XML core

| Signature | Description |
| --- | --- |
| parse_xml_from_zip(path, member) -> ET.Element | Parse one XML member of an ODF ZIP. |
| xml_bytes(root) -> bytes | Serialize an element to UTF-8 bytes with XML declaration. |
| write_odf_with_replacements(input_path, output_path, replacements, mimetype_value) -> None | Copy an ODF package, swapping named members; mimetype stays first and stored. |
| pack_dir_as_odf(source_dir, output_path, mimetype_value) -> None | Repack an extracted directory into a valid ODF file. |
| copy_into_package(input_path, output_path, package_path, source, replacements, mimetype_value) -> None | Add a single file to a package plus member replacements. |
| copy_with_multiple_members(input_path, output_path, new_members, replacements, mimetype_value) -> None | Add several new members in one pass (e.g. Object N/ sub-packages). |
| unpack_to_temp(path) -> tempfile.TemporaryDirectory | Extract a package to a managed temp directory. |
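
The "mimetype stays first and stored" rule is what makes a ZIP recognizable as an ODF package: the mimetype member has to be the first entry and must not be compressed. The sketch below illustrates that rule with the standard zipfile module only; write_package is a hypothetical name and this is not the library's implementation.

```python
import io
import zipfile


def write_package(target, mimetype: str, members: dict) -> None:
    """Write an ODF-style ZIP: 'mimetype' first and uncompressed, the rest deflated."""
    with zipfile.ZipFile(target, "w") as zf:
        # The first entry is stored (no compression) so tools can read it directly.
        zf.writestr("mimetype", mimetype.encode("ascii"), compress_type=zipfile.ZIP_STORED)
        for name, data in members.items():
            if name == "mimetype":
                continue  # already written above
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


# Build a tiny package in memory and inspect its first entry.
buffer = io.BytesIO()
write_package(
    buffer,
    "application/vnd.oasis.opendocument.text",
    {"content.xml": b"<doc/>"},
)
with zipfile.ZipFile(buffer) as zf:
    first = zf.infolist()[0]
    content = zf.read("content.xml")
```

Here target can be a path or a file-like object, since zipfile.ZipFile accepts both.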

Manifest and media

| Signature | Description |
| --- | --- |
| ensure_manifest_entry(manifest_root, full_path, media_type, ns, q_fn) -> None | Add or update a manifest:file-entry. |
| media_type_for(path) -> str | MIME type from a file extension. |
| sniff_image_mime(path) -> str | MIME type from magic bytes, with extension fallback. |
| unique_picture_name(existing, image) -> str | Collision-free Pictures/ filename. |
| unique_object_name(existing) -> str | Next free Object N sub-package name. |
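
To make "magic bytes, with extension fallback" and "next free Object N" concrete, here is a small standard-library sketch of both ideas. The function names, the signature list, and the octet-stream default are assumptions chosen for illustration; the library's own helpers may cover more formats and behave differently.

```python
import mimetypes

# Leading "magic" bytes for a few common image formats (illustrative subset).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime(data: bytes, filename: str = "") -> str:
    """Guess a MIME type from leading bytes, falling back to the file extension."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


def next_object_name(existing: set) -> str:
    """Return the first 'Object N' (N = 1, 2, ...) not already in use."""
    n = 1
    while f"Object {n}" in existing:
        n += 1
    return f"Object {n}"
```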

Metadata

| Signature | Description |
| --- | --- |
| update_meta_for_edit(meta_root, ns, q_fn) -> None | Refresh meta:modification-date/generator and bump editing-cycles. |
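
What such a metadata refresh involves can be sketched with ElementTree alone. The version below is an illustration under stated assumptions, not the library's code: it assumes the modification date is kept in dc:date (where the ODF metadata schema puts it), and stamp_edit and the generator string are hypothetical.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
META = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
DC = "http://purl.org/dc/elements/1.1/"


def _child(parent: ET.Element, tag: str) -> ET.Element:
    """Return the child with this tag, creating it if it is missing."""
    found = parent.find(tag)
    return found if found is not None else ET.SubElement(parent, tag)


def stamp_edit(meta_root: ET.Element, generator: str) -> None:
    """Set generator and modification date, and increment editing-cycles."""
    office_meta = _child(meta_root, f"{{{OFFICE}}}meta")
    _child(office_meta, f"{{{META}}}generator").text = generator
    now = datetime.now(timezone.utc)
    _child(office_meta, f"{{{DC}}}date").text = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    cycles = _child(office_meta, f"{{{META}}}editing-cycles")
    try:
        current = int(cycles.text or "0")
    except ValueError:
        current = 0  # unreadable counter: start again from zero
    cycles.text = str(current + 1)


meta_root = ET.fromstring(
    f'<office:document-meta xmlns:office="{OFFICE}" xmlns:meta="{META}">'
    "<office:meta><meta:editing-cycles>3</meta:editing-cycles></office:meta>"
    "</office:document-meta>"
)
stamp_edit(meta_root, "example-tool/0.1")
office_meta = meta_root.find(f"{{{OFFICE}}}meta")
cycles_text = office_meta.find(f"{{{META}}}editing-cycles").text
date_text = office_meta.find(f"{{{DC}}}date").text
generator_text = office_meta.find(f"{{{META}}}generator").text
```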

Flat ODF

| Signature | Description |
| --- | --- |
| pack_flat_odf(input_zip, output_flat) -> None | Convert a zipped ODF to flat single-XML form (pictures and Object N/ sub-packages inlined). |
| unpack_flat_odf(input_flat, output_zip) -> None | Convert a flat ODF back to a zipped package and rebuild the manifest. |
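
Before converting, it can help to know which form a file is in. A zipped package declares its type in the mimetype member, while a flat file carries it as the office:mimetype attribute on the root element. The sketch below reads either one using only the standard library; odf_mimetype is a hypothetical helper, not something the package exports.

```python
import io
import zipfile
from typing import Optional
from xml.etree import ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odf_mimetype(data: bytes) -> Optional[str]:
    """Return the declared mimetype of a zipped or flat ODF payload, if any."""
    stream = io.BytesIO(data)
    if zipfile.is_zipfile(stream):
        with zipfile.ZipFile(stream) as zf:
            if "mimetype" in zf.namelist():
                return zf.read("mimetype").decode("ascii").strip()
            return None
    # Not a ZIP: treat the payload as flat XML and read the root attribute.
    root = ET.fromstring(data)
    return root.get(f"{{{OFFICE}}}mimetype")


PRESENTATION = "application/vnd.oasis.opendocument.presentation"

# A minimal zipped payload and a minimal flat payload to compare.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("mimetype", PRESENTATION)
zipped = buffer.getvalue()
flat = f'<office:document xmlns:office="{OFFICE}" office:mimetype="{PRESENTATION}"/>'.encode()
```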

Text walker, locator, insertion

| Signature | Description |
| --- | --- |
| replace_text_in_element(element, old, new) -> int | Structure-preserving find/replace across text and child tails. |
| replace_pattern_with_element_in_element(element, pattern, factory) -> int | Replace regex matches with generated elements. |
| find_text_position_in_element(element, needle) -> tuple \| None | Locate needle, returning (node, "text"\|"tail", offset). |
| insert_after_text_in_element(element, anchor, new_element) -> bool | Splice an element in right after an anchor string. |
| insert_in_paragraph(paragraph, position, new_element) -> None | Insert at the start or end of a paragraph. |
| wrap_text_with_pair_in_element(element, start_anchor, end_anchor, start_element, end_element) -> bool | Wrap an intra-paragraph text range with a start/end pair. |
| wrap_text_across_elements(elements, start_anchor, end_anchor, start_element, end_element) -> bool | Same, spanning multiple paragraphs. |
| ensure_sequence_declarations(text_root, names, ns) -> None | Ensure text:sequence-decl entries exist. |
| clear_children(element) -> None | Remove all children of an element. |
| local_name(tag) -> str | Local name from a Clark-notation tag. |
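
In ElementTree, the character data of a paragraph is split between each element's text and the tail that follows each child, which is why the walker functions talk about "text and child tails". The simplified sketch below shows the idea for replacing and locating strings without touching the markup. It is not the library's implementation, the function names are hypothetical, and it only finds matches that sit entirely inside one text or tail string.

```python
from xml.etree import ElementTree as ET


def replace_text(element: ET.Element, old: str, new: str) -> int:
    """Replace `old` with `new` in every text/tail string under `element`."""
    if not old:
        return 0
    count = 0
    for node in element.iter():
        if node.text and old in node.text:
            count += node.text.count(old)
            node.text = node.text.replace(old, new)
        # The root's own tail lies outside the element, so leave it alone.
        if node is not element and node.tail and old in node.tail:
            count += node.tail.count(old)
            node.tail = node.tail.replace(old, new)
    return count


def find_text(element: ET.Element, needle: str):
    """Return (node, "text" or "tail", offset) for the first hit in document order."""
    if element.text and needle in element.text:
        return element, "text", element.text.index(needle)
    for child in element:
        hit = find_text(child, needle)
        if hit is not None:
            return hit
        if child.tail and needle in child.tail:
            return child, "tail", child.tail.index(needle)
    return None


para = ET.fromstring("<p>Dear <b>{{CLIENT}}</b>, welcome to {{CLIENT}}.</p>")
hit = find_text(para, "welcome")           # found in the tail of <b>
replaced = replace_text(para, "{{CLIENT}}", "ACME GmbH")
result = ET.tostring(para, encoding="unicode")
```

The element tree keeps its shape: only the strings change, so character styles such as the b span above survive the replacement.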

Styles and pictures

| Signature | Description |
| --- | --- |
| inject_styles_from_file(input_path, styles_path, output_path, mimetype_value) -> list[str] | Replace styles.xml; returns dangling style references. |
| embed_pictures(input_path, pictures, output_path, mimetype_value, ns, q_fn) -> None | Bulk-add images to Pictures/ and the manifest. |
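
A "dangling style reference" is a style name that content uses but that no style definition provides once styles.xml has been swapped. One way to detect this, sketched here under simplifying assumptions, is to compare every style-name attribute in the content with every style:name defined in either file. How the library actually decides what counts as a reference is not documented here, and the function names below are hypothetical.

```python
from xml.etree import ElementTree as ET

STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def defined_style_names(*roots: ET.Element) -> set:
    """Collect every style:name attribute value found under the given roots."""
    names = set()
    for root in roots:
        for node in root.iter():
            name = node.get(f"{{{STYLE}}}name")
            if name:
                names.add(name)
    return names


def referenced_style_names(root: ET.Element) -> set:
    """Collect values of attributes whose local name is exactly 'style-name'."""
    refs = set()
    for node in root.iter():
        for attr, value in node.attrib.items():
            if attr.rsplit("}", 1)[-1] == "style-name":
                refs.add(value)
    return refs


def dangling_styles(content_root: ET.Element, styles_root: ET.Element) -> list:
    """Style names used in content but defined in neither document."""
    defined = defined_style_names(content_root, styles_root)
    return sorted(referenced_style_names(content_root) - defined)


styles = ET.fromstring(f'<s xmlns:style="{STYLE}"><style:style style:name="Heading"/></s>')
content = ET.fromstring(
    f'<c xmlns:style="{STYLE}" xmlns:text="{TEXT}">'
    '<style:style style:name="P1"/>'
    '<text:p text:style-name="P1"/>'
    '<text:p text:style-name="Heading"/>'
    '<text:p text:style-name="Quote"/>'
    "</c>"
)
missing = dangling_styles(content, styles)
```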

Schema validation

| Signature | Description |
| --- | --- |
| ensure_schema(name) -> Path | Download and cache an OASIS ODF 1.3 RelaxNG schema (content/manifest). |
| validate_against_schema(xml_bytes_input, schema_name) -> tuple[bool, list[str]] | Validate XML bytes against a cached schema (requires lxml). |

External tooling

| Signature | Description |
| --- | --- |
| find_soffice() -> str | Locate the LibreOffice soffice binary; raises if absent. |
| find_pandoc() -> str \| None | Locate the pandoc binary. |
| latex_to_mathml(latex) -> bytes | Convert a LaTeX snippet to MathML via Pandoc. |
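
The two locator styles above (return None versus raise) can be illustrated with shutil.which. This is a sketch only: the helper names, the choice of FileNotFoundError, and the idea of passing extra directories to search are assumptions, and the library's own search order is not documented here.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def locate_binary(names: Iterable[str], extra_dirs: Iterable[Path] = ()) -> Optional[str]:
    """Return the first match on PATH, then in extra_dirs; None if nothing is found."""
    candidates = list(names)
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    for directory in extra_dirs:
        for name in candidates:
            found = shutil.which(name, path=str(directory))
            if found:
                return found
    return None


def require_binary(names: Iterable[str], extra_dirs: Iterable[Path] = ()) -> str:
    """Like locate_binary, but raise instead of returning None."""
    candidates = list(names)
    found = locate_binary(candidates, extra_dirs)
    if found is None:
        raise FileNotFoundError(f"none of {candidates} found")
    return found


# Optional lookup: the result is a path string or None, depending on the machine.
pandoc = locate_binary(["pandoc"])
```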

Rendering

| Signature | Description |
| --- | --- |
| render_to_pdf(odf_path, outdir) -> Path | Render an ODF file to PDF via LibreOffice (isolated profile). |
| pdf_to_pngs(pdf_path, outdir, dpi=150) -> list[Path] | Render each PDF page to a PNG via pdftoppm (Poppler). |
| build_contact_sheet(images, output_path, columns=0) -> Path | Compose page thumbnails into one labelled grid image. Requires Pillow (pip install open-document-lib[render]). |
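
For orientation, the command lines that this kind of rendering typically boils down to are sketched below. The functions only build argument lists, so nothing is executed. The flags are standard LibreOffice and Poppler options, but whether the library invokes the tools in exactly this way is an assumption, and the function names and the "page" output prefix are hypothetical.

```python
from pathlib import Path


def soffice_pdf_command(soffice: str, odf_path: Path, outdir: Path, profile_dir: Path) -> list:
    """Headless PDF conversion using a separate (isolated) user profile directory."""
    return [
        soffice,
        # A dedicated profile keeps the run independent of the user's own settings.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(odf_path),
    ]


def pdftoppm_command(pdf_path: Path, outdir: Path, dpi: int = 150) -> list:
    """One PNG per page; pdftoppm appends the page number and .png to the prefix."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(outdir / "page")]


pdf_cmd = soffice_pdf_command("soffice", Path("report.odt"), Path("out"), Path("profile"))
png_cmd = pdftoppm_command(Path("out") / "report.pdf", Path("out"), dpi=200)
```

Either list can be passed to subprocess.run(cmd, check=True) once the binaries are available.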

License

MIT. See LICENSE.

Download files

Source Distribution

open_document_lib-1.4.0.tar.gz (70.3 kB)

Uploaded Source

Built Distribution

open_document_lib-1.4.0-py3-none-any.whl (26.3 kB)

Uploaded Python 3

File details

Details for the file open_document_lib-1.4.0.tar.gz.

File metadata

  • Download URL: open_document_lib-1.4.0.tar.gz
  • Upload date:
  • Size: 70.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for open_document_lib-1.4.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 0954c9d0479d3ce566c80dc82fa600edb6e18bbda6cad72d1c038a268e73d81c |
| MD5 | 13a89fd911888d5ed25650313ea2a222 |
| BLAKE2b-256 | 4384a61eb118b5cf5120dcf2b996b605b0ddc604a6b25e14e41d622b73d9985c |

Provenance

The following attestation bundles were made for open_document_lib-1.4.0.tar.gz:

Publisher: publish.yml on leiverkus/open-document-skills

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file open_document_lib-1.4.0-py3-none-any.whl.

File hashes

Hashes for open_document_lib-1.4.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | d9ef4ff4b891a3dcbcc52b3d3798cd9e1f13fbaa5315850c2a100ffcdb2ddb1d |
| MD5 | 0c224c1f5a287517fb6ebe245c6565a9 |
| BLAKE2b-256 | a6c95f357cf49a693376e02985c6707438f42d7ffa575be8fcfbd1a41660b434 |

Provenance

The following attestation bundles were made for open_document_lib-1.4.0-py3-none-any.whl:

Publisher: publish.yml on leiverkus/open-document-skills
