
Sanitize PO files from gettext for version control

Project description


sanpo

sanpo is a command line tool to sanitize PO files from gettext for version control.

The problem

The gettext tool collects text to be translated from source code into PO files that can be sent to translators. These files contain metadata about the project that can be helpful in an email-based workflow.

When a PO file is created for the first time, these metadata look like this:

"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2021-09-06 16:16+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

However, once the PO file is under version control, these metadata get in the way. Most of them are available from the commit history anyway. In addition, when gettext runs automatically as part of the build process, the PO-Revision-Date gets updated every time, even if none of the messages changed. The result is spuriously modified PO files without any actual changes worth committing.

The solution

Because your localized software does not use the PO files directly but the MO files compiled from them, the unhelpful metadata can be removed from the PO files, which is exactly what sanpo does.

A typical build chain would look like this:

  1. gettext - collect PO file
  2. msgfmt - compile into MO file
  3. sanpo - remove unhelpful metadata from PO
  4. commit possible changes in PO file

sanpo simply takes one or more PO files as arguments, for example:

sanpo locale/de/LC_MESSAGES/django.po locale/en/LC_MESSAGES/django.po locale/hu/LC_MESSAGES/django.po

After this, the remaining metadata are:

"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

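To illustrate the effect, here is a minimal Python sketch that filters header lines the same way the before and after listings differ. It is not sanpo's implementation: the function name strip_unhelpful_headers and the list of header names are assumptions inferred from the examples above, and the sketch only looks at the lines it is given rather than parsing a whole PO file.

```python
# Header names dropped in this sketch. The list is inferred from the
# before/after examples in this README, not taken from sanpo's source code.
UNHELPFUL_HEADERS = (
    "Project-Id-Version",
    "Report-Msgid-Bugs-To",
    "POT-Creation-Date",
    "PO-Revision-Date",
    "Last-Translator",
    "Language-Team",
)


def strip_unhelpful_headers(po_lines):
    """Return the given PO header lines without the unhelpful metadata."""
    # Each header line in a PO file is a quoted string such as
    # "Language: \n", so the prefix includes the opening quote.
    prefixes = tuple('"' + name + ":" for name in UNHELPFUL_HEADERS)
    return [line for line in po_lines if not line.startswith(prefixes)]


# Hypothetical input: a shortened version of the header shown earlier.
header_lines = [
    r'"Project-Id-Version: PACKAGE VERSION\n"',
    r'"POT-Creation-Date: 2021-09-06 16:16+0200\n"',
    r'"Language: \n"',
    r'"MIME-Version: 1.0\n"',
]

for line in strip_unhelpful_headers(header_lines):
    print(line)
```

A real tool would restrict this filter to the header entry at the top of the file so that translated messages starting with the same text are left alone.
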
Using the special pattern **, folders can be scanned recursively.

To sanitize PO files for all languages in a certain folder, use for example:

sanpo locale/**/django.po
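
To preview which files such a pattern might select, the following sketch uses Python's standard glob module with recursive=True. That sanpo expands ** in the same way is an assumption, and the helper name find_po_files is hypothetical.

```python
import glob


def find_po_files(pattern):
    """Return the sorted paths matching pattern, expanding ** recursively."""
    # With recursive=True, ** matches zero or more nested folders.
    return sorted(glob.glob(pattern, recursive=True))


# Lists matching files relative to the current folder, if there are any.
for path in find_po_files("locale/**/django.po"):
    print(path)
```

Running this from the project root before calling sanpo can help confirm that the pattern covers every language folder.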

Django

For Django projects, the typical workflow is:

  1. django-admin makemessages
  2. django-admin compilemessages
  3. sanpo
  4. commit possible changes in PO file
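
The first three steps could be scripted roughly as follows. This is a sketch under assumptions: the --all flag for makemessages and the PO file paths are choices made for illustration, the function names are hypothetical, and committing is left as a manual step.

```python
import subprocess


def build_commands(po_files):
    """Return the commands for steps 1 to 3 as argument lists."""
    return [
        # Assumption: update the PO files for all existing locales.
        ["django-admin", "makemessages", "--all"],
        ["django-admin", "compilemessages"],
        ["sanpo", *po_files],
    ]


def run_chain(po_files):
    """Run each command in order and stop at the first failure."""
    for command in build_commands(po_files):
        subprocess.run(command, check=True)


# Example call, to be run from the Django project root:
# run_chain(["locale/de/LC_MESSAGES/django.po"])
```

Because check=True raises an error as soon as a command fails, sanpo only runs after the messages were collected and compiled successfully.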

