Standard-library toolkit for reading, editing, and writing OpenDocument Format files (ODT, ODP, ODS, ODG)
open-document-lib
A standard-library toolkit for reading, editing, and writing OpenDocument
Format files — text documents (.odt), presentations (.odp),
spreadsheets (.ods), and drawings (.odg) — plus their flat (single-XML)
variants.
open-document-lib is the shared library behind the
open-document-skills
agent skills. The core has no dependencies beyond the Python standard
library; a few helpers opt into lxml or bibtexparser when present.
Install
pip install open-document-lib
# optional extras (quoted so shells such as zsh do not expand the brackets)
pip install "open-document-lib[validate]"   # lxml: RelaxNG schema validation
pip install "open-document-lib[scholarly]"  # bibtexparser: BibTeX citation ingest
Requires Python 3.10+. The package ships py.typed, so type checkers see
its annotations.
Quick start
from pathlib import Path
from xml.etree import ElementTree as ET
from odf_lib import (
    parse_xml_from_zip, xml_bytes, write_odf_with_replacements,
    replace_text_in_element, update_meta_for_edit,
)
src = Path("report.odt")
content = parse_xml_from_zip(src, "content.xml")
# Structure-preserving find/replace across the document body.
text_ns = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
for para in content.iter(f"{{{text_ns}}}p"):
    replace_text_in_element(para, "{{CLIENT}}", "ACME GmbH")
# Stamp the edit into meta.xml (modification date, generator, cycle count).
meta = parse_xml_from_zip(src, "meta.xml")
# update_meta_for_edit needs a namespace map + qualified-name helper;
# the skills' *_common.py wrappers supply these.
write_odf_with_replacements(
    src, Path("report-out.odt"),
    {"content.xml": xml_bytes(content)},
    "application/vnd.oasis.opendocument.text",
)
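
The quick start stops short of calling `update_meta_for_edit` because that call needs a namespace map and a qualified-name helper. The sketch below shows one plausible shape for such a pair using only the standard library. `NS` and `q` are hypothetical names, and the exact callable that `update_meta_for_edit` expects is an assumption here, so check the skills' `*_common.py` wrappers before relying on it.

```python
from xml.etree import ElementTree as ET

# Hypothetical namespace map for illustration; odf_lib exports its own
# mapping as ODF_NAMESPACES.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def q(prefix: str, local: str) -> str:
    """Build a Clark-notation tag ('{uri}local') from a prefix and a local name."""
    return f"{{{NS[prefix]}}}{local}"


# Assumed call shape, inferred from the documented signature
# update_meta_for_edit(meta_root, ns, q_fn); not verified against the library:
# update_meta_for_edit(meta, NS, q)

# The helper produces tags that ElementTree understands directly.
root = ET.Element(q("office", "document-meta"))
ET.SubElement(root, q("office", "meta"))
print(root.tag)
```

Keeping the map and the helper together means every tag lookup in calling code goes through one place, which is presumably why the library takes both as parameters instead of hard-coding prefixes.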
Flat-ODF round-trip:
from odf_lib import pack_flat_odf, unpack_flat_odf
pack_flat_odf(Path("deck.odp"), Path("deck.fodp")) # ZIP → single XML
unpack_flat_odf(Path("deck.fodp"), Path("deck.odp")) # XML → ZIP package
API reference
Everything below is exported directly from the odf_lib package and is
covered by semantic versioning from 1.0 onward. Anything in
odf_lib.odf_common that is not listed here (notably _-prefixed
helpers) is internal and may change without notice.
Constants
| Name | Description |
|---|---|
| `VERSION` | Library version string (also `odf_lib.__version__`). |
| `ODF_NAMESPACES` | `dict[str, str]` of ODF namespace prefixes → URIs. |
| `FLAT_EXTENSIONS` | Mapping of ODF mimetype → flat-file extension (`.fodt`, …). |
ZIP / XML core
| Signature | Description |
|---|---|
| `parse_xml_from_zip(path, member) -> ET.Element` | Parse one XML member of an ODF ZIP. |
| `xml_bytes(root) -> bytes` | Serialize an element to UTF-8 bytes with XML declaration. |
| `write_odf_with_replacements(input_path, output_path, replacements, mimetype_value) -> None` | Copy an ODF package, swapping named members; `mimetype` stays first and stored. |
| `pack_dir_as_odf(source_dir, output_path, mimetype_value) -> None` | Repack an extracted directory into a valid ODF file. |
| `copy_into_package(input_path, output_path, package_path, source, replacements, mimetype_value) -> None` | Add a single file to a package plus member replacements. |
| `copy_with_multiple_members(input_path, output_path, new_members, replacements, mimetype_value) -> None` | Add several new members in one pass (e.g. `Object N/` sub-packages). |
| `unpack_to_temp(path) -> tempfile.TemporaryDirectory` | Extract a package to a managed temp directory. |
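
The "mimetype stays first and stored" rule comes from the ODF packaging convention: the `mimetype` member must be the first entry in the ZIP and must be written uncompressed. As a rough illustration of that rule only, and not the library's actual implementation, the following standard-library sketch writes such an archive. `write_minimal_package` is a hypothetical helper, and its output is not a complete ODF document because it has no manifest.

```python
import tempfile
import zipfile
from pathlib import Path


def write_minimal_package(
    output_path: Path, mimetype_value: str, members: dict[str, bytes]
) -> None:
    """Write a ZIP whose first entry is an uncompressed 'mimetype' member."""
    with zipfile.ZipFile(output_path, "w") as zf:
        # The mimetype entry goes in first and is stored, not deflated.
        zf.writestr(
            zipfile.ZipInfo("mimetype"),
            mimetype_value.encode("ascii"),
            compress_type=zipfile.ZIP_STORED,
        )
        # Every other member can be compressed as usual.
        for name, data in members.items():
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "demo.odt"
    write_minimal_package(
        out,
        "application/vnd.oasis.opendocument.text",
        {"content.xml": b"<placeholder/>"},
    )
    with zipfile.ZipFile(out) as zf:
        first = zf.infolist()[0]
        first_name, first_method = first.filename, first.compress_type
        stored_mimetype = zf.read("mimetype")
```

Reading the archive back and checking the first entry, as the last lines do, is a cheap sanity check to run on any package a tool has rewritten.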
Manifest and media
| Signature | Description |
|---|---|
| `ensure_manifest_entry(manifest_root, full_path, media_type, ns, q_fn) -> None` | Add or update a `manifest:file-entry`. |
| `media_type_for(path) -> str` | MIME type from a file extension. |
| `sniff_image_mime(path) -> str` | MIME type from magic bytes, with extension fallback. |
| `unique_picture_name(existing, image) -> str` | Collision-free `Pictures/` filename. |
| `unique_object_name(existing) -> str` | Next free `Object N` sub-package name. |
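
To make "magic bytes, with extension fallback" concrete, here is a minimal standard-library sketch of the idea. It is an assumption-laden illustration, not the code behind `sniff_image_mime`: the hypothetical `sniff_mime` takes raw bytes plus a filename where the library function takes a path, and it recognizes only three common formats.

```python
import mimetypes

# Leading byte signatures for a few common image formats.
_MAGIC = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def sniff_mime(data: bytes, filename: str) -> str:
    """Guess a MIME type from leading bytes, falling back to the file extension."""
    for magic, mime in _MAGIC:
        if data.startswith(magic):
            return mime
    # No signature matched: fall back to the extension, then to a generic type.
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


# Content wins over a misleading extension.
print(sniff_mime(b"\xff\xd8\xff\xe0rest-of-file", "mislabeled.png"))
```

Checking content before the extension matters for manifests, since a wrong `media-type` entry can outlive a renamed file.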
Metadata
| Signature | Description |
|---|---|
| `update_meta_for_edit(meta_root, ns, q_fn) -> None` | Refresh `meta:modification-date`/`generator` and bump `editing-cycles`. |
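
As a rough picture of the "bump `editing-cycles`" part, the sketch below increments `meta:editing-cycles` with plain `xml.etree.ElementTree`. It is a hypothetical stand-in, not `update_meta_for_edit` itself: `bump_editing_cycles` is an invented name, it hard-codes the namespaces where the library takes `ns` and `q_fn`, and it leaves the date and generator fields alone.

```python
from xml.etree import ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def bump_editing_cycles(meta_root: ET.Element) -> int:
    """Increment meta:editing-cycles under office:meta and return the new value."""
    office_meta = meta_root.find(f"{{{OFFICE_NS}}}meta")
    if office_meta is None:
        office_meta = ET.SubElement(meta_root, f"{{{OFFICE_NS}}}meta")
    cycles = office_meta.find(f"{{{META_NS}}}editing-cycles")
    if cycles is None:
        cycles = ET.SubElement(office_meta, f"{{{META_NS}}}editing-cycles")
    try:
        current = int(cycles.text or "0")
    except ValueError:
        # Treat a non-numeric counter as if no edits had been recorded.
        current = 0
    cycles.text = str(current + 1)
    return current + 1


doc = ET.fromstring(
    f'<office:document-meta xmlns:office="{OFFICE_NS}" xmlns:meta="{META_NS}">'
    "<office:meta><meta:editing-cycles>3</meta:editing-cycles></office:meta>"
    "</office:document-meta>"
)
new_count = bump_editing_cycles(doc)
```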
Flat ODF
| Signature | Description |
|---|---|
| `pack_flat_odf(input_zip, output_flat) -> None` | Convert a zipped ODF to flat single-XML form (pictures and `Object N/` sub-packages inlined). |
| `unpack_flat_odf(input_flat, output_zip) -> None` | Convert a flat ODF back to a zipped package and rebuild the manifest. |
Text walker, locator, insertion
| Signature | Description |
|---|---|
| `replace_text_in_element(element, old, new) -> int` | Structure-preserving find/replace across text and child tails. |
| `replace_pattern_with_element_in_element(element, pattern, factory) -> int` | Replace regex matches with generated elements. |
| `find_text_position_in_element(element, needle) -> tuple \| None` | Locate `needle`, returning `(node, "text" \| "tail", offset)`. |
| `insert_after_text_in_element(element, anchor, new_element) -> bool` | Splice an element in right after an anchor string. |
| `insert_in_paragraph(paragraph, position, new_element) -> None` | Insert at the start or end of a paragraph. |
| `wrap_text_with_pair_in_element(element, start_anchor, end_anchor, start_element, end_element) -> bool` | Wrap an intra-paragraph text range with a start/end pair. |
| `wrap_text_across_elements(elements, start_anchor, end_anchor, start_element, end_element) -> bool` | Same, spanning multiple paragraphs. |
| `ensure_sequence_declarations(text_root, names, ns) -> None` | Ensure `text:sequence-decl` entries exist. |
| `clear_children(element) -> None` | Remove all children of an element. |
| `local_name(tag) -> str` | Local name from a Clark-notation tag. |
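
"Text and child tails" refers to how ElementTree stores mixed content: characters before the first child live in `element.text`, and characters after each child live in that child's `tail`. The sketch below shows a simplified find/replace over those slots. It is an illustration under stated assumptions, not the library's algorithm: `replace_in_text_nodes` is a hypothetical name, and unlike a full implementation it does not handle a match that straddles two nodes.

```python
from xml.etree import ElementTree as ET


def replace_in_text_nodes(element: ET.Element, old: str, new: str) -> int:
    """Replace old with new in every text and tail slot inside element.

    Returns the number of replacements. Child elements are left in place,
    so inline formatting such as spans survives the edit.
    """
    if not old:
        return 0
    count = 0
    for node in element.iter():  # yields element itself, then all descendants
        if node.text and old in node.text:
            count += node.text.count(old)
            node.text = node.text.replace(old, new)
        # The root's own tail belongs to its parent, so leave it untouched.
        if node is not element and node.tail and old in node.tail:
            count += node.tail.count(old)
            node.tail = node.tail.replace(old, new)
    return count


para = ET.fromstring("<p>Dear <b>{{CLIENT}}</b>, welcome {{CLIENT}}.</p>")
replaced = replace_in_text_nodes(para, "{{CLIENT}}", "ACME GmbH")
result = ET.tostring(para, encoding="unicode")
```

Here one placeholder sits in the `text` of `<b>` and the other in its `tail`, which is why a plain string replace on serialized XML is avoided: editing the slots in place keeps the element structure intact.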
Styles and pictures
| Signature | Description |
|---|---|
| `inject_styles_from_file(input_path, styles_path, output_path, mimetype_value) -> list[str]` | Replace `styles.xml`; returns dangling style references. |
| `embed_pictures(input_path, pictures, output_path, mimetype_value, ns, q_fn) -> None` | Bulk-add images to `Pictures/` and the manifest. |
Schema validation
| Signature | Description |
|---|---|
| `ensure_schema(name) -> Path` | Download and cache an OASIS ODF 1.3 RelaxNG schema (content/manifest). |
| `validate_against_schema(xml_bytes_input, schema_name) -> tuple[bool, list[str]]` | Validate XML bytes against a cached schema (requires lxml). |
External tooling
| Signature | Description |
|---|---|
| `find_soffice() -> str` | Locate the LibreOffice `soffice` binary; raises if absent. |
| `find_pandoc() -> str \| None` | Locate the `pandoc` binary. |
| `latex_to_mathml(latex) -> bytes` | Convert a LaTeX snippet to MathML via Pandoc. |
License
MIT. See LICENSE.