Python library for creating, editing and manipulating TMX files

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Project description

Hypomnema

The industrial-grade TMX framework for Python.

Hypomnema is a strictly typed, policy-driven parser and generator for the TMX 1.4b standard. It provides a robust infrastructure for building Localization and NLP tools, designed to handle messy translation memories without crashing.

🚀 Why this library?

Most TMX parsers are simple XML wrappers. Hypomnema is an infrastructure library offering:

🛡️ Policy-Driven Recovery: Configure exactly how to handle errors (missing segments, extra text, invalid tags). Choose between raise, ignore, log, or repair.
🔌 Backend Agnostic: Runs on lxml for speed or standard xml.etree for zero-dependency environments.
✨ Type Safe: Fully annotated with modern Python 3.12+ types. Returns structured dataclasses, not raw XML nodes.
🏗️ Symmetrical: Deserialize XML to objects, manipulate them, and serialize back to XML with round-trip integrity.

📦 Installation

pip install hypomnema
OR
uv add hypomnema

For maximum performance, install with lxml support and use the LxmlBackend:

pip install "hypomnema[lxml]"
OR
uv add "hypomnema[lxml]"
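
Before picking a backend, it can help to check whether lxml is importable in the current environment. The sketch below uses only the standard library. The helper name `pick_backend_name` is hypothetical: it only reports which of the two backend class names mentioned in this README would fit, and it does not import or call hypomnema.

```python
from importlib.util import find_spec


def pick_backend_name() -> str:
    """Return the README's backend class name that fits this environment.

    Hypothetical helper: it only reports a name, it does not touch hypomnema.
    """
    # find_spec returns None when the top-level package cannot be found.
    if find_spec("lxml") is not None:
        return "LxmlBackend"
    return "StandardBackend"


print(pick_backend_name())
```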

⚡ Usage (Low-Level API)

Note: v0.4.2 exposes the core architecture components. Better docs and high-level convenience facades (load/dump) are coming in v0.5.

1. Deserializing (Reading)

To parse a file, you compose a Backend (the parser) with a Deserializer (the logic).

import xml.etree.ElementTree as ET
import hypomnema as hm

# 1. Initialize the Backend
backend = hm.StandardBackend()

# 2. Initialize the Deserializer
deserializer = hm.Deserializer(backend=backend)

# 3. Parse content (using standard ET for I/O in this example)
tree = ET.parse("memory.tmx")
root_element = tree.getroot()

# 4. Deserialize to Python Objects
tmx: hm.Tmx = deserializer.deserialize(root_element)

print(f"Source Language: {tmx.header.srclang}")
for tu in tmx.body:
    print(f"TU: {tu.tuid}")
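
To try the reading example without an existing translation memory, a small TMX 1.4 document can be written with the standard library alone. This is a minimal sketch: the sample content and the temporary directory are illustrative assumptions, and nothing here calls hypomnema.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# A tiny TMX document: one header and one translation unit with two variants.
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="MyScript" creationtoolversion="1.0" segtype="sentence"
          o-tmf="JSON" adminlang="en-US" srclang="en-US" datatype="plaintext"/>
  <body>
    <tu tuid="1">
      <tuv xml:lang="en-US"><seg>Hello World</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Bonjour le monde</seg></tuv>
    </tu>
  </body>
</tmx>
"""

# Write the sample into a throwaway directory so no real file is overwritten.
path = Path(tempfile.mkdtemp()) / "memory.tmx"
path.write_text(SAMPLE_TMX, encoding="utf-8")

# Parse it the same way the example does, to confirm it is well-formed.
root_element = ET.parse(path).getroot()
tuids = [tu.get("tuid") for tu in root_element.iter("tu")]
print(root_element.tag, root_element.find("header").get("srclang"), tuids)
```

The resulting `root_element` is a plain ElementTree element, which is the kind of input the reading example passes to the deserializer.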

2. Handling Dirty Data (Policies)

Real-world TMX files are often broken. Configure a DeserializationPolicy to handle errors gracefully.

If not specified, the default policy is strict on purpose to fail fast and prevent silent data corruption.

You can also configure the logging level for each policy value independently of its behavior.

import hypomnema as hm
import logging

# Configure a permissive policy
policy = hm.DeserializationPolicy()

# If a <tuv> has no <seg>, don't crash -> ignore the error (returns empty content)
policy.missing_seg = hm.PolicyValue("ignore", logging.WARNING)

# If a <tu> has garbage text between tags, ignore it
policy.extra_text = hm.PolicyValue("ignore", logging.INFO)

# `backend` and `root_element` are the objects created in the reading example
deserializer = hm.Deserializer(backend=backend, policy=policy)
tmx = deserializer.deserialize(root_element)
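
The idea of pairing a behavior with an independent log level can be illustrated without the library. The following is a conceptual sketch only: `SketchPolicyValue` and `handle_missing_seg` are hypothetical names, and this is not how hypomnema implements its policies internally.

```python
import logging
from dataclasses import dataclass
from typing import Literal

logger = logging.getLogger("policy_sketch")


@dataclass
class SketchPolicyValue:
    """Hypothetical stand-in: a behavior plus the level used to log it."""

    behavior: Literal["raise", "ignore"]
    log_level: int


def handle_missing_seg(policy: SketchPolicyValue) -> list[str]:
    """Apply the policy to a <tuv> that has no <seg> child."""
    message = "<tuv> is missing its <seg> child"
    # The log level is applied independently of the chosen behavior.
    logger.log(policy.log_level, message)
    if policy.behavior == "raise":
        raise ValueError(message)
    # "ignore": carry on with empty content instead of failing.
    return []


strict = SketchPolicyValue("raise", logging.ERROR)
lenient = SketchPolicyValue("ignore", logging.WARNING)

print(handle_missing_seg(lenient))  # []
try:
    handle_missing_seg(strict)
except ValueError as exc:
    print(f"strict policy raised: {exc}")
```

Keeping the behavior and the log level as separate fields means a problem can be ignored loudly or raised quietly, which is the flexibility the section describes.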

3. Serializing (Writing)

from datetime import datetime, timezone
import xml.etree.ElementTree as ET

import hypomnema as hm

# 1. Build the object tree
tmx_obj = hm.Tmx(
    version="1.4",
    header=hm.Header(
        creationtool="MyScript",
        creationtoolversion="1.0",
        segtype=hm.Segtype.SENTENCE,
        o_tmf="JSON",
        adminlang="en-US",
        srclang="en-US",
        datatype="plaintext",
        creationdate=datetime.now(timezone.utc)
    ),
    body=[
        hm.Tu(
            tuid="1",
            srclang="en-US",
            variants=[
                hm.Tuv(lang="en-US", content=["Hello World"]),
                hm.Tuv(lang="fr-FR", content=["Bonjour le monde"])
            ]
        )
    ]
)

# 2. Serialize to XML Element
backend = hm.StandardBackend()
serializer = hm.Serializer(backend=backend)
xml_root = serializer.serialize(tmx_obj)

# 3. Write to file (using backend specifics)
ET.ElementTree(xml_root).write("output.tmx", encoding="utf-8", xml_declaration=True)
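
A quick way to sanity-check a written file is to parse the bytes back and compare the parts that matter. The sketch below does this with the standard library only, using a hand-built element tree as a stand-in for the serializer's output; it makes no claim about the exact XML that hypomnema emits.

```python
import io
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Build a tiny TMX-shaped tree by hand (a stand-in for a serializer's output).
root = ET.Element("tmx", {"version": "1.4"})
body = ET.SubElement(root, "body")
tu = ET.SubElement(body, "tu", {"tuid": "1"})
tuv = ET.SubElement(tu, "tuv", {f"{{{XML_NS}}}lang": "en-US"})
ET.SubElement(tuv, "seg").text = "Hello World"

# Write to an in-memory buffer with the same arguments as the file example.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
written = buffer.getvalue()

# Parse the bytes back and collect the segment text for comparison.
reparsed = ET.fromstring(written)
segments = [seg.text for seg in reparsed.iter("seg")]
print(segments)
```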

🧩 Architecture

The library is built on three decoupled layers:

Backend Layer: Abstracts the XML parser. Choose LxmlBackend (faster, built on lxml) or StandardBackend (portable, standard library only).
Orchestration Layer: Serializer and Deserializer classes that manage recursion and dispatch.
Handler Layer: Specialized classes (TuvDeserializer, NoteSerializer) that implement the business logic and policy checks for specific TMX elements.

🛠️ Advanced Usage

Working with Mixed Content (Tags)

TMX segments often contain inline markup like placeholders (<ph>) or formatting (<bpt>). hypomnema parses these into a mixed list of strings and objects.

import hypomnema as hm

# Content is a list of strings and inline objects.
# `variant` is a hm.Tuv, for example one entry of a Tu's `variants` list.
# XML: Hello <ph x="1">Name</ph>
print(variant.content)
# Output: ["Hello ", Ph(x=1, content=["Name"])]

Release history

- 0.8: Apr 9, 2026
- 0.7: Feb 25, 2026
- 0.6: Jan 28, 2026
- 0.5.0: Jan 15, 2026
- 0.4.4: Dec 19, 2025
- 0.4.3: Dec 15, 2025
- 0.4.2: Dec 4, 2025 (this version)

Download files

Source Distribution

hypomnema-0.4.2.tar.gz (23.8 kB)

Uploaded Dec 4, 2025 Source

Built Distribution

hypomnema-0.4.2-py3-none-any.whl (28.5 kB)

Uploaded Dec 4, 2025 Python 3

File details

Details for the file hypomnema-0.4.2.tar.gz.

File metadata

Download URL: hypomnema-0.4.2.tar.gz
Upload date: Dec 4, 2025
Size: 23.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.5

File hashes

Hashes for hypomnema-0.4.2.tar.gz
Algorithm	Hash digest
SHA256	`4dab731e7d0050ea2008d2166adf679b11179f4a56dca07f5b5091d59ba36eb4`
MD5	`f610087942c76af2d916d16726762c50`
BLAKE2b-256	`53b3bb1831a522cf0e335fa89968b8b16f4dc71b717a277c2b1db39aa5d108ce`

File details

Details for the file hypomnema-0.4.2-py3-none-any.whl.

File metadata

Download URL: hypomnema-0.4.2-py3-none-any.whl
Upload date: Dec 4, 2025
Size: 28.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.5

File hashes

Hashes for hypomnema-0.4.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`696b0602715624bdf3ce4b3daccc356a25a80577956ac88d99efc7f234944bad`
MD5	`22bfb63cf4f2f77f06411f21e23cc2be`
BLAKE2b-256	`7e2c1cd5ecd181d9ae3605d9a5083bd27b0292850f938f42a3e63d0907b34a3f`
