open-document-lib

A standard-library toolkit for reading, editing, and writing OpenDocument Format files — text documents (.odt), presentations (.odp), spreadsheets (.ods), and drawings (.odg) — plus their flat (single-XML) variants.

open-document-lib is the shared library behind the open-document-skills agent skills. The core has no dependencies beyond the Python standard library; a few helpers opt into lxml or bibtexparser when present.

Install

pip install open-document-lib

# optional extras (quoted so shells such as zsh do not expand the brackets)
pip install "open-document-lib[validate]"    # lxml: RelaxNG schema validation
pip install "open-document-lib[scholarly]"   # bibtexparser: BibTeX citation ingest

Requires Python 3.10+. The package ships py.typed, so type checkers see its annotations.

Quick start

from pathlib import Path
from xml.etree import ElementTree as ET
from odf_lib import (
    parse_xml_from_zip, xml_bytes, write_odf_with_replacements,
    replace_text_in_element, update_meta_for_edit,
)

src = Path("report.odt")
content = parse_xml_from_zip(src, "content.xml")

# Structure-preserving find/replace across the document body.
text_ns = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
for para in content.iter(f"{{{text_ns}}}p"):
    replace_text_in_element(para, "{{CLIENT}}", "ACME GmbH")

# Optional: stamp the edit into meta.xml (modification date, generator, editing cycles).
# update_meta_for_edit(meta, ns, q_fn) needs a namespace map and a qualified-name helper;
# the skills' *_common.py wrappers supply these, so the call itself is omitted here.
meta = parse_xml_from_zip(src, "meta.xml")

write_odf_with_replacements(
    src, Path("report-out.odt"),
    {"content.xml": xml_bytes(content)},
    "application/vnd.oasis.opendocument.text",
)

Flat-ODF round-trip:

from odf_lib import pack_flat_odf, unpack_flat_odf

pack_flat_odf(Path("deck.odp"), Path("deck.fodp"))    # ZIP  → single XML
unpack_flat_odf(Path("deck.fodp"), Path("deck.odp"))  # XML  → ZIP package

API reference

Everything below is exported directly from the odf_lib package and is covered by semantic versioning from 1.0 onward. Anything in odf_lib.odf_common that is not listed here (notably _-prefixed helpers) is internal and may change without notice.

Constants

| Name | Description |
| --- | --- |
| `VERSION` | Library version string (also `odf_lib.__version__`). |
| `ODF_NAMESPACES` | `dict[str, str]` of ODF namespace prefixes → URIs. |
| `FLAT_EXTENSIONS` | Mapping of ODF mimetype → flat-file extension (`.fodt`, …). |
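
Several functions in this reference take `ns` and `q_fn` arguments whose exact shapes are not spelled out here. The following is a minimal sketch of one plausible shape, assuming `ns` is a prefix → URI mapping like `ODF_NAMESPACES` and `q_fn` turns a prefixed name such as `text:p` into ElementTree's Clark notation. The `NS` dict and the `q` helper are hypothetical stand-ins, not the library's own objects.

```python
from xml.etree import ElementTree as ET

# Hypothetical stand-in for ODF_NAMESPACES (a small subset of the ODF namespace URIs).
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}


def q(name: str, ns: dict = NS) -> str:
    """Turn 'prefix:local' into Clark notation '{uri}local' (assumed q_fn shape)."""
    prefix, local = name.split(":", 1)
    return f"{{{ns[prefix]}}}{local}"


# Build an element with a namespaced tag without typing the URI by hand.
para = ET.Element(q("text:p"))
para.text = "Hello"
print(para.tag)
```

A helper like this keeps call sites readable (`q("text:p")` instead of a long f-string) and fails loudly with a `KeyError` when a prefix is missing from the map.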

ZIP / XML core

| Signature | Description |
| --- | --- |
| `parse_xml_from_zip(path, member) -> ET.Element` | Parse one XML member of an ODF ZIP. |
| `xml_bytes(root) -> bytes` | Serialize an element to UTF-8 bytes with XML declaration. |
| `write_odf_with_replacements(input_path, output_path, replacements, mimetype_value) -> None` | Copy an ODF package, swapping named members; mimetype stays first and stored. |
| `pack_dir_as_odf(source_dir, output_path, mimetype_value) -> None` | Repack an extracted directory into a valid ODF file. |
| `copy_into_package(input_path, output_path, package_path, source, replacements, mimetype_value) -> None` | Add a single file to a package plus member replacements. |
| `copy_with_multiple_members(input_path, output_path, new_members, replacements, mimetype_value) -> None` | Add several new members in one pass (e.g. `Object N/` sub-packages). |
| `unpack_to_temp(path) -> tempfile.TemporaryDirectory` | Extract a package to a managed temp directory. |
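
The phrase "mimetype stays first and stored" refers to the ODF packaging rule that the `mimetype` member is the first entry in the ZIP and is not compressed. The sketch below shows that rule with the standard library only; `write_minimal_package` is a hypothetical helper written for illustration and says nothing about how the library implements it.

```python
import io
import zipfile


def write_minimal_package(members: dict, mimetype: str) -> bytes:
    """Build an ODF-style ZIP in memory: mimetype first and uncompressed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype member is written first, without compression.
        zf.writestr(zipfile.ZipInfo("mimetype"), mimetype,
                    compress_type=zipfile.ZIP_STORED)
        for name, data in members.items():
            if name == "mimetype":
                continue  # already written above
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


blob = write_minimal_package(
    {"content.xml": b"<doc/>"},
    "application/vnd.oasis.opendocument.text",
)
with zipfile.ZipFile(io.BytesIO(blob)) as zf:
    first = zf.infolist()[0]
    print(first.filename, first.compress_type == zipfile.ZIP_STORED)
```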

Manifest and media

| Signature | Description |
| --- | --- |
| `ensure_manifest_entry(manifest_root, full_path, media_type, ns, q_fn) -> None` | Add or update a `manifest:file-entry`. |
| `media_type_for(path) -> str` | MIME type from a file extension. |
| `sniff_image_mime(path) -> str` | MIME type from magic bytes, with extension fallback. |
| `unique_picture_name(existing, image) -> str` | Collision-free `Pictures/` filename. |
| `unique_object_name(existing) -> str` | Next free `Object N` sub-package name. |
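
To illustrate what "magic bytes, with extension fallback" means, here is a minimal standard-library sketch. It is not the library's implementation: `sniff_mime` is a hypothetical function that takes the raw bytes and a filename (the documented `sniff_image_mime` takes a path), and it only recognizes three common formats.

```python
import mimetypes

# Leading byte signatures for a few common image formats.
_MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime(data: bytes, filename: str) -> str:
    """Prefer the content signature; fall back to the file extension."""
    for magic, mime in _MAGIC:
        if data.startswith(magic):
            return mime
    guessed, _ = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


# A GIF saved with a misleading .png extension is still detected as a GIF.
print(sniff_mime(b"GIF89a...", "logo.png"))
```

Checking content first guards against mislabeled files, which matters because the manifest's media type should describe what the member really contains.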

Metadata

| Signature | Description |
| --- | --- |
| `update_meta_for_edit(meta_root, ns, q_fn) -> None` | Refresh `meta:modification-date`/`generator` and bump `editing-cycles`. |
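
As a rough picture of what an edit stamp involves, the sketch below bumps `meta:editing-cycles` and sets `meta:generator` with ElementTree alone. `bump_editing_cycles` is a hypothetical helper, the modification date is left out to keep the example short, and none of this is claimed to match the library's internals.

```python
from xml.etree import ElementTree as ET

META = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def bump_editing_cycles(meta_root: ET.Element, generator: str) -> int:
    """Increment meta:editing-cycles, set meta:generator, return the new count."""
    container = meta_root.find(f"{{{OFFICE}}}meta")
    if container is None:
        container = ET.SubElement(meta_root, f"{{{OFFICE}}}meta")

    def child(tag: str) -> ET.Element:
        # Fetch the child element, creating it when the document lacks one.
        el = container.find(tag)
        if el is None:
            el = ET.SubElement(container, tag)
        return el

    cycles = child(f"{{{META}}}editing-cycles")
    count = int((cycles.text or "0").strip() or "0") + 1
    cycles.text = str(count)
    child(f"{{{META}}}generator").text = generator
    return count


root = ET.fromstring(
    f'<office:document-meta xmlns:office="{OFFICE}" xmlns:meta="{META}">'
    "<office:meta><meta:editing-cycles>3</meta:editing-cycles></office:meta>"
    "</office:document-meta>"
)
print(bump_editing_cycles(root, "example-tool/0.1"))
```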

Flat ODF

| Signature | Description |
| --- | --- |
| `pack_flat_odf(input_zip, output_flat) -> None` | Convert a zipped ODF to flat single-XML form (pictures and `Object N/` sub-packages inlined). |
| `unpack_flat_odf(input_flat, output_zip) -> None` | Convert a flat ODF back to a zipped package and rebuild the manifest. |
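
"Pictures inlined" means that, in flat ODF, an image is no longer a separate ZIP member referenced by `xlink:href` but is carried as base64 text in an `office:binary-data` child. The sketch below shows that transformation on an in-memory tree; `inline_images` is a hypothetical helper for illustration and is not a description of how `pack_flat_odf` works internally.

```python
import base64
from xml.etree import ElementTree as ET

DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
XLINK = "http://www.w3.org/1999/xlink"


def inline_images(root: ET.Element, pictures: dict) -> int:
    """Replace xlink:href references with base64 office:binary-data children."""
    inlined = 0
    # Materialize the list first because the loop adds children to the tree.
    for image in list(root.iter(f"{{{DRAW}}}image")):
        href = image.get(f"{{{XLINK}}}href")
        if href not in pictures:
            continue  # external or unknown reference: leave untouched
        for attr in ("href", "type", "show", "actuate"):
            image.attrib.pop(f"{{{XLINK}}}{attr}", None)
        data = ET.SubElement(image, f"{{{OFFICE}}}binary-data")
        data.text = base64.b64encode(pictures[href]).decode("ascii")
        inlined += 1
    return inlined


frame = ET.fromstring(
    f'<draw:frame xmlns:draw="{DRAW}" xmlns:xlink="{XLINK}">'
    '<draw:image xlink:href="Pictures/a.png" xlink:type="simple"/>'
    "</draw:frame>"
)
print(inline_images(frame, {"Pictures/a.png": b"\x89PNG"}))
```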

Text walker, locator, insertion

| Signature | Description |
| --- | --- |
| `replace_text_in_element(element, old, new) -> int` | Structure-preserving find/replace across text and child tails. |
| `replace_pattern_with_element_in_element(element, pattern, factory) -> int` | Replace regex matches with generated elements. |
| `find_text_position_in_element(element, needle) -> tuple \| None` | Locate `needle`, returning `(node, "text"\|"tail", offset)`. |
| `insert_after_text_in_element(element, anchor, new_element) -> bool` | Splice an element in right after an anchor string. |
| `insert_in_paragraph(paragraph, position, new_element) -> None` | Insert at the start or end of a paragraph. |
| `wrap_text_with_pair_in_element(element, start_anchor, end_anchor, start_element, end_element) -> bool` | Wrap an intra-paragraph text range with a start/end pair. |
| `wrap_text_across_elements(elements, start_anchor, end_anchor, start_element, end_element) -> bool` | Same, spanning multiple paragraphs. |
| `ensure_sequence_declarations(text_root, names, ns) -> None` | Ensure `text:sequence-decl` entries exist. |
| `clear_children(element) -> None` | Remove all children of an element. |
| `local_name(tag) -> str` | Local name from a Clark-notation tag. |
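
In ElementTree, the character data of a paragraph is spread over each element's `.text` and the `.tail` that follows each child, so a "structure-preserving" replace has to visit both slots without touching the elements themselves. The sketch below shows the idea with a hypothetical `replace_in_text_nodes` function. It is a simplified stand-in: it does not handle a match that straddles two nodes, and whether the library's `replace_text_in_element` does so is not stated here.

```python
from xml.etree import ElementTree as ET


def replace_in_text_nodes(element: ET.Element, old: str, new: str) -> int:
    """Replace `old` in every text/tail slot under `element`; return the hit count."""
    count = 0
    for node in element.iter():
        if node.text and old in node.text:
            count += node.text.count(old)
            node.text = node.text.replace(old, new)
        # The root's own tail belongs to its parent, so it is left alone.
        if node is not element and node.tail and old in node.tail:
            count += node.tail.count(old)
            node.tail = node.tail.replace(old, new)
    return count


p = ET.fromstring("<p>Dear {{CLIENT}}, <span>welcome</span> to {{CLIENT}}.</p>")
hits = replace_in_text_nodes(p, "{{CLIENT}}", "ACME GmbH")
print(hits, ET.tostring(p, encoding="unicode"))
```

The `<span>` child survives untouched, which is the point: a plain string replace on the serialized XML could also rewrite tag names or attribute values.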

Styles and pictures

| Signature | Description |
| --- | --- |
| `inject_styles_from_file(input_path, styles_path, output_path, mimetype_value) -> list[str]` | Replace `styles.xml`; returns dangling style references. |
| `embed_pictures(input_path, pictures, output_path, mimetype_value, ns, q_fn) -> None` | Bulk-add images to `Pictures/` and the manifest. |

Schema validation

| Signature | Description |
| --- | --- |
| `ensure_schema(name) -> Path` | Download and cache an OASIS ODF 1.3 RelaxNG schema (content/manifest). |
| `validate_against_schema(xml_bytes_input, schema_name) -> tuple[bool, list[str]]` | Validate XML bytes against a cached schema (requires lxml). |

External tooling

| Signature | Description |
| --- | --- |
| `find_soffice() -> str` | Locate the LibreOffice `soffice` binary; raises if absent. |
| `find_pandoc() -> str \| None` | Locate the `pandoc` binary. |
| `latex_to_mathml(latex) -> bytes` | Convert a LaTeX snippet to MathML via Pandoc. |
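
How these functions search for the binaries is not documented here. A minimal sketch of one common approach, assuming a plain `PATH` lookup with `shutil.which`; `locate_binary` and the candidate names passed to it are illustrative only.

```python
import shutil


def locate_binary(*candidates: str):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


# LibreOffice is commonly installed as `soffice` or `libreoffice`.
print(locate_binary("soffice", "libreoffice"))
print(locate_binary("pandoc"))
```

A caller that needs the tool can turn a `None` result into an exception, while one that treats it as optional can degrade gracefully, mirroring the difference between the `str` and `str | None` return types in the table above.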

License

MIT. See LICENSE.
