Utilities for extracting and applying translations to multilingual SVG files.

SVG Translation Tool

This tool extracts multilingual text pairs from SVG files and applies translations to other SVG files by inserting missing <text systemLanguage="XX"> blocks.

Installation

This tool requires Python 3.10+. Install the package and its lightweight core dependencies with:

pip install CopySVGTranslation

Usage

Extracting and injecting in a single step

from pathlib import Path
from CopySVGTranslation import svg_extract_and_inject

tree = svg_extract_and_inject(
    extract_file=Path("examples/source_multilingual.svg"),
    inject_file=Path("examples/target_missing_translations.svg"),
    data_output_file=Path("examples/data.json"),
    save_result=True,
)

if tree is not None:
    print("Injection completed!")

The helper stores the extracted phrases in the file passed as data_output_file (here Path("examples/data.json")) and, when save_result=True, writes the translated SVG to output_dir=Path("./translated"). If you also need statistics about how many translations were inserted, call the lower-level injector with return_stats=True:

from pathlib import Path
from CopySVGTranslation.injection import inject

tree, stats = inject(
    inject_file="examples/target_missing_translations.svg",
    mapping_files=["CopySVGTranslation/data/source_multilingual.svg.json"],
    output_dir=Path("./translated"),
    save_result=True,
    return_stats=True,
)

print(stats)

Injecting with pre-translated data

When you already have the translation JSON, load it and use inject directly. Important parameters include overwrite to update existing translations and output_dir to control where translated files are written.

from pathlib import Path
from CopySVGTranslation import inject

translations = {
    "new": {
        "Hello": {"ar": "مرحبًا", "fr": "Bonjour"},
    }
}

tree, stats = inject(
    inject_file=Path("examples/target_missing_translations.svg"),
    all_mappings=translations,
    output_dir=Path("./translated"),
    overwrite=True,
    save_result=True,
    return_stats=True,
)

print("Saved to", Path("./translated/target_missing_translations.svg"))
print(stats)

Data Model

The extractor writes a JSON document rooted under the "new" key. Each entry maps normalized English text to a dictionary of language codes and translations. An example of the modern format:

{
  "new": {
    "but are connected in anti-phase": {
      "ar": "لكنها موصولة بمرحلتين متعاكستين."
    }
  }
}

Older exports may omit the wrapper and look like {"english": {"ar": "…"}}. The injector transparently accepts both structures, but the recommended format is the nested "new" layout shown above.
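As a rough illustration of how the two layouts can be handled uniformly, here is a minimal sketch. The helper name unwrap_mappings is hypothetical and is not part of CopySVGTranslation; the library's own handling may differ. The sketch assumes that in the modern layout every value under "new" is itself a dictionary of language codes.

```python
from typing import Dict

Translations = Dict[str, Dict[str, str]]


def unwrap_mappings(data: dict) -> Translations:
    """Return the phrase-to-translations table from either layout.

    Hypothetical helper for illustration only.
    """
    inner = data.get("new")
    # Modern layout: "new" wraps a table whose values are language dicts.
    if isinstance(inner, dict) and all(isinstance(v, dict) for v in inner.values()):
        return inner
    # Legacy layout: the document itself is the table. This also covers a
    # legacy phrase that happens to be the word "new".
    return data


modern = {"new": {"hello": {"ar": "مرحبا", "fr": "Bonjour"}}}
legacy = {"hello": {"ar": "مرحبا", "fr": "Bonjour"}}
print(unwrap_mappings(modern) == unwrap_mappings(legacy))
```

Checking the value types avoids misreading a legacy file that contains the English phrase "new" as a modern wrapper.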

Extract Example

Input SVG (arabic.svg)

<?xml version="1.0"?>
<svg xmlns="http://www.w3.org/2000/svg">
  <switch>
      <text id="t0-ar" systemLanguage="ar">
          <tspan id="t0-ar">الموسيقى في عام 2020</tspan>
      </text>
      <text id="t0-fr" systemLanguage="fr">
          <tspan id="t0-fr">La musique en 2020</tspan>
      </text>
      <text id="t0">
          <tspan id="t0">Music in 2020</tspan>
      </text>
  </switch>
  <switch>
      <text id="t0-ar" systemLanguage="ar">
          <tspan id="t0-ar">مرحبا</tspan>
      </text>
      <text id="t0-fr" systemLanguage="fr">
          <tspan id="t0-fr">Bonjour</tspan>
      </text>
      <text id="t0">
          <tspan id="t0">Hello</tspan>
      </text>
  </switch>
</svg>

Python code

from pathlib import Path
from CopySVGTranslation import extract

translations = extract(
    svg_file_path=Path("arabic.svg"),
    case_insensitive=True,
)

Extracted JSON

{
    "new": {
        "music in 2020": {
            "ar": "الموسيقى في عام 2020",
            "fr": "La musique en 2020"
        },
        "hello": {
            "ar": "مرحبا",
            "fr": "Bonjour"
        }
    },
    "title": {
        "music in": {
            "ar": "الموسيقى في عام",
            "fr": "La musique en"
        }
    },
    "tspans_by_id": {
        "t0": "Hello"
    }
}
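To make the relationship between the input SVG and the "new" table more concrete, the following standard-library sketch walks each <switch> element and pairs the default <text> (the one without systemLanguage) with its translated siblings. The function sketch_extract is hypothetical and only approximates the "new" key; it does not reproduce the "title" or "tspans_by_id" keys, and it is not the library's implementation.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def sketch_extract(svg_source: str, case_insensitive: bool = True) -> dict:
    """Build a {"new": {...}} table from <switch> groups (illustrative only)."""
    root = ET.fromstring(svg_source)
    table = {}
    for switch in root.iter(SVG_NS + "switch"):
        default_text = None
        by_lang = {}
        for text_el in switch.findall(SVG_NS + "text"):
            # Join all nested tspan text and collapse whitespace.
            content = " ".join("".join(text_el.itertext()).split())
            lang = text_el.get("systemLanguage")
            if lang is None:
                default_text = content
            else:
                by_lang[lang] = content
        if default_text and by_lang:
            key = default_text.lower() if case_insensitive else default_text
            table[key] = by_lang
    return {"new": table}


SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <switch>
    <text id="t0-ar" systemLanguage="ar"><tspan>مرحبا</tspan></text>
    <text id="t0-fr" systemLanguage="fr"><tspan>Bonjour</tspan></text>
    <text id="t0"><tspan>Hello</tspan></text>
  </switch>
</svg>"""

result = sketch_extract(SAMPLE)
print(result)
```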

Injection Example

  • TODO

Testing

Run the unit tests:

python -m pytest tests -v

Implementation Details

Text Normalization

The tool normalizes text by:

  • Trimming leading and trailing whitespace
  • Replacing multiple internal whitespace characters with a single space
  • Optionally converting to lowercase for case-insensitive matching
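The three normalization steps above can be sketched in a few lines. The function name normalize_text and its signature are assumptions made for illustration; the library's internal helper may be named and structured differently.

```python
def normalize_text(text: str, case_insensitive: bool = False) -> str:
    """Trim, collapse internal whitespace, and optionally lowercase."""
    # str.split() with no arguments splits on any run of whitespace and
    # drops leading and trailing whitespace, so joining with a single
    # space covers the first two steps.
    collapsed = " ".join(text.split())
    return collapsed.lower() if case_insensitive else collapsed


print(normalize_text("  Music   in\n2020 ", case_insensitive=True))
```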

ID Generation

When adding new translation nodes, the tool generates unique IDs by:

  • Taking the existing ID and appending the language code (e.g., text2213 becomes text2213-ar)
  • If the generated ID already exists, appending a numeric suffix until unique (e.g., text2213-ar-1)
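The ID scheme described above could look roughly like this. The helper make_unique_id is a hypothetical name, and the sketch assumes the caller already has the set of IDs present in the SVG.

```python
def make_unique_id(base_id: str, lang: str, existing_ids: set) -> str:
    """Append the language code, then a numeric suffix until the ID is free."""
    candidate = f"{base_id}-{lang}"
    suffix = 1
    while candidate in existing_ids:
        candidate = f"{base_id}-{lang}-{suffix}"
        suffix += 1
    return candidate


used = {"text2213", "text2213-ar"}
new_id = make_unique_id("text2213", "ar", used)
used.add(new_id)  # record the new ID so later calls do not reuse it
print(new_id)
```

Recording each generated ID in the set keeps repeated insertions for the same language from colliding.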

Error Handling

The tool includes comprehensive error handling for:

  • Missing input files
  • Invalid XML structure
  • Missing required attributes
  • File permission issues
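As a hedged illustration of the first, second, and fourth cases, a caller can guard SVG loading with the standard library as shown below. The function load_svg_safely is hypothetical; it shows one common pattern (log and return None) and does not claim to mirror how CopySVGTranslation reports errors internally.

```python
import logging
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def load_svg_safely(path: Path) -> Optional[ET.ElementTree]:
    """Parse an SVG file, returning None instead of raising on common failures."""
    try:
        return ET.parse(path)
    except FileNotFoundError:
        logger.error("SVG file not found: %s", path)
    except PermissionError:
        logger.error("No permission to read: %s", path)
    except ET.ParseError as exc:
        logger.error("Invalid XML in %s: %s", path, exc)
    return None


tree = load_svg_safely(Path("examples/does_not_exist.svg"))
if tree is None:
    print("Could not load the SVG; see the log for details.")
```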
