Simple Language Support for Web Development

Simple LANGuage support for the Web (using AI)

License: MIT

Overview

Use AI models from Hugging Face to translate your website.

The system works with two different approaches:

  • Dynamic: Translates text on the fly. It is easy to integrate with any framework, but it can be slow for long texts.
  • Static: Uses a sentence-based translation lookup file, which must be created before deployment. A key-based approach would require an extra layer of complexity (maybe in the future). This approach is fast, but it is harder (sometimes impossible) to integrate with some frameworks, for example Flask + jinja2 templates.
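
To make the difference between a sentence-based and a key-based lookup concrete, here is a minimal sketch using plain dictionaries. The names `sentence_lookup`, `key_lookup`, `by_sentence`, and `by_key` are hypothetical and are not part of slangweb.

```python
# Sentence-based lookup: the English sentence itself is the key.
sentence_lookup = {"Hello World": "Hola Mundo"}

# Key-based lookup (not what the package does): an extra layer maps an
# abstract key to the text in every language, including the default one.
key_lookup = {"greeting": {"en": "Hello World", "es": "Hola Mundo"}}


def by_sentence(sentence: str) -> str:
    """Return the stored translation, or the sentence itself if none exists."""
    return sentence_lookup.get(sentence, sentence)


def by_key(key: str, lang: str) -> str:
    """Return the text stored under `key` for language `lang`."""
    return key_lookup[key][lang]


print(by_sentence("Hello World"))  # Hola Mundo
print(by_key("greeting", "es"))    # Hola Mundo
```

With the sentence-based form, the source code stays readable in the default language and needs no separate key catalogue, which is the trade-off described above.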

At the moment, only Romance languages are supported, through the model Helsinki-NLP/opus-mt-en-ROMANCE. This model can translate from English to the following languages:

| Language | Code | Language | Code | Language | Code |
| --- | --- | --- | --- | --- | --- |
| Spanish | es | Spanish (Uruguay) | es_uy | Neapolitan | nap |
| Spanish (Argentina) | es_ar | Spanish (Venezuela) | es_ve | Sicilian | scn |
| Spanish (Chile) | es_cl | Portuguese | pt | Venetian | vec |
| Spanish (Colombia) | es_co | Portuguese (Brazil) | pt_br | Aragonese | an |
| Spanish (Costa Rica) | es_cr | Portuguese (Portugal) | pt_pt | Arpitan | frp |
| Spanish (Dominican Republic) | es_do | French | fr | Corsican | co |
| Spanish (Ecuador) | es_ec | French (Belgium) | fr_be | Ladin | lld |
| Spanish (El Salvador) | es_sv | French (Switzerland) | fr_ch | Ladino | lad |
| Spanish (Guatemala) | es_gt | French (Canada) | fr_ca | Latin | la |
| Spanish (Honduras) | es_hn | French (France) | fr_fr | Ligurian | lij |
| Spanish (Mexico) | es_mx | Italian | it | Mirandese | mwl |
| Spanish (Nicaragua) | es_ni | Italian (Italy) | it_it | Occitan | oc |
| Spanish (Panama) | es_pa | Catalan | ca | Romansh | rm |
| Spanish (Peru) | es_pe | Galician | gl | Sardinian | sc |
| Spanish (Puerto Rico) | es_pr | Romanian | ro | Walloon | wa |
| Spanish (Spain) | es_es | Lombard | lmo | Friulian | fur |
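
For orientation, the sketch below shows how this model can be called directly with the Hugging Face `transformers` library, where the target language is selected by prefixing the text with a `>>code<<` token. How slangweb invokes the model internally is not documented here, so treat this as an independent illustration. The helper names `with_target_language` and `translate` are hypothetical.

```python
def with_target_language(text: str, code: str) -> str:
    """Prefix a sentence with the >>code<< token that selects the target language."""
    return f">>{code}<< {text}"


def translate(texts: list[str], code: str) -> list[str]:
    """Translate English sentences with Helsinki-NLP/opus-mt-en-ROMANCE."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Tokenize the prefixed sentences as one padded batch of PyTorch tensors.
    batch = tokenizer(
        [with_target_language(t, code) for t in texts],
        return_tensors="pt",
        padding=True,
    )
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


print(with_target_language("Hello World", "es"))  # >>es<< Hello World
```

Calling `translate(["Hello World"], "es")` downloads the model on first use, so it needs network access and enough disk space.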

This package creates a folder inside your repo to store a configuration file and other files for the models.

Installation

Simply install via pip:

pip install slangweb

Initialization

Let's suppose you have the following folder structure:

my_site/
├── app.py            # main application entry
├── src/              # source package / modules
│   └── index.py      # main site logic / translator usage example
└── pages/            # HTML/templates/pages for the site
    └── a_page.html   # example page

Open a terminal, activate the environment in which you installed the package, and run:

(.venv) C:\my_site>slangweb init

This will create the configuration file and the models lookup file.

Configuration file

The configuration file (JSON) has the following structure:

{
  "base_folder": "slangweb",
  "models_lookup_file": "models_lookup.json",
  "models_folder": "models",
  "lookups_folder": "lookups",
  "default_language": "en",
  "encoding": "utf-8",
  "source_folders": ["."],
  "supported_languages": ["es"],
  "translator_class": "SW"
}
  • base_folder: The main folder where all files are stored (including the config file).
  • models_lookup_file: Name of the models lookup file. This file is created in base_folder and must stay there.
  • models_folder: Folder where the models are stored. It must also be inside base_folder.
  • lookups_folder: Folder where the translation lookup files are stored.
  • default_language: The base language of the site. At the moment only English is supported.
  • encoding: Encoding of the lookup files. At the moment only utf-8 is supported.
  • source_folders: Folders that contain the Python source files where the slangweb translator class is used. Developers can modify this at will.
  • supported_languages: Languages that the site will support. There will be one translation lookup file for each language.
  • translator_class: The name of the translator instance used for static translations across the site. See the Usage section.
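
As a rough illustration of how these fields relate to each other, the sketch below reads a config file and resolves the paths it describes. `load_paths` is a hypothetical helper, not a slangweb function, and it assumes that the config file lives directly inside base_folder and that lookups_folder sits inside base_folder too.

```python
import json
from pathlib import Path


def load_paths(config_file: Path) -> dict:
    """Read the config and resolve the files and folders it names."""
    config = json.loads(config_file.read_text(encoding="utf-8"))
    base = config_file.parent  # assumption: the config file sits in base_folder
    return {
        "models_lookup_file": base / config["models_lookup_file"],
        "models_folder": base / config["models_folder"],
        # One translation lookup file per supported language.
        "lookups": {
            lang: base / config["lookups_folder"] / f"{lang}.json"
            for lang in config["supported_languages"]
        },
    }
```

With the example configuration above, `load_paths(...)["lookups"]["es"]` would point at `slangweb/lookups/es.json`.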

Models lookup

The models_lookup.json has the following structure:

{
    "es": {
        "model": "Helsinki-NLP/opus-mt-en-ROMANCE",
        "name": "Spanish"
    },
    ...
}

This file is created automatically. Other languages and models can be added if needed.
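
Adding a language amounts to adding one more entry with the same two fields. The sketch below does this on an in-memory dictionary; `add_language` is a hypothetical helper, and the French entry is only an example built from the language table above.

```python
import json

ROMANCE_MODEL = "Helsinki-NLP/opus-mt-en-ROMANCE"


def add_language(lookup: dict, code: str, name: str, model: str = ROMANCE_MODEL) -> dict:
    """Return a copy of the models lookup with one more language entry."""
    updated = dict(lookup)
    updated[code] = {"model": model, "name": name}
    return updated


lookup = {"es": {"model": ROMANCE_MODEL, "name": "Spanish"}}
lookup = add_language(lookup, "fr", "French")
print(json.dumps(lookup, indent=4))
```

A new language presumably also has to be listed in supported_languages in the configuration file before its model is downloaded.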

Usage

Once the configuration has been created (and modified, if needed), download the models using the CLI application:

(.venv) C:\my_site>slangweb download-models

This will download all the models needed for the languages listed under supported_languages in the configuration file.

Finally, you can start using the package in your Python files. There are two main ways of using it: statically and dynamically.

1. Static

For each language listed under supported_languages in the configuration file, a translation lookup file is created inside the lookups_folder. The translation lookup file is a JSON file that maps each sentence in the original language to its translation. For example (Spanish):

es.json

{
    "Hello World": "Hola Mundo",
    ...
}

The purpose of this approach is to avoid translating on the fly, which improves loading speed.

To use the static translation system you can call the instance, which is the same as calling the method .get_translation:

from slangweb import Translator
SW = Translator()
translation = SW("Translate this")
same_translation = SW.get_translation("Translate this")
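
The equivalence between calling the instance and calling `.get_translation` can be pictured with a small stand-in class. This is a hypothetical sketch, not slangweb's implementation. In particular, returning the original sentence when no translation is stored is an assumption of this sketch.

```python
class StaticLookup:
    """Hypothetical stand-in for a static translator; not slangweb's code."""

    def __init__(self, lookups: dict, default_language: str = "en"):
        self.lookups = lookups  # e.g. {"es": {"Hello World": "Hola Mundo"}}
        self.default_language = default_language
        self.language = default_language

    def set_language(self, lang: str) -> None:
        self.language = lang

    def get_translation(self, sentence: str) -> str:
        # The default language needs no lookup; unknown sentences fall back
        # to the original text (an assumption of this sketch).
        if self.language == self.default_language:
            return sentence
        return self.lookups.get(self.language, {}).get(sentence, sentence)

    # Calling the instance is the same as calling get_translation.
    __call__ = get_translation


sw = StaticLookup({"es": {"Hello World": "Hola Mundo"}})
sw.set_language("es")
print(sw("Hello World"))                  # Hola Mundo
print(sw.get_translation("Hello World"))  # Hola Mundo
```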

Example using Dash:

from dash import html
from slangweb import Translator

# Init Translator
# The variable name must match the "translator_class" in the config file
SW = Translator()

def layout(lang: str = 'en'):
    # Select the language before building the page
    SW.set_language(lang)
    return html.Div([
        html.H2(SW('This is a Test for the static translation system.')),
        html.H2(SW("Thanks for using SlangWeb!"))
    ])

There are two ways to create the translation lookup files:

  1. By running the website on localhost and visiting the pages.
  2. By running the CLI:
(.venv) C:\my_site>slangweb sync

This will create the following file, C:\my_site\slangweb\lookups\es.json:

{
  "This is a Test for the static translation system.": "Esta es una prueba para el sistema de traducción estática.",
  "Thanks for using SlangWeb!": "¡Gracias por usar SlangWeb!"
}
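
How `slangweb sync` discovers the sentences is not described here. One plausible way to build such a file, shown purely as an assumption, is to scan the Python sources for calls to the translator name and collect their string arguments. The function `find_sentences` below is hypothetical and uses only the standard `ast` module.

```python
import ast


def find_sentences(source: str, translator_name: str = "SW") -> list[str]:
    """Collect string literals passed directly to `translator_name(...)`."""
    sentences = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == translator_name
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            sentences.append(node.args[0].value)
    return sentences


source = '''
SW = Translator()
title = html.H2(SW("Thanks for using SlangWeb!"))
'''
print(find_sentences(source))  # ['Thanks for using SlangWeb!']
```

A scan like this would explain why the variable name has to match translator_class: only calls through that exact name are recognised.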

2. Dynamic

In this case, the translation lookup file will not be created, and the translation will happen on-the-fly.

In your code (using Dash):

from dash import html
from slangweb import Translator

# Init Translator
SW = Translator()
# Short alias for the on-the-fly translation method
t = SW.translate

def layout(lang: str = 'en'):
    SW.set_language(lang)
    return html.Div([
        html.H2(t('This is a Test for the dynamic translation system.')),
        html.H2(t("Thanks for using SlangWeb!"))
    ])

Recommendations & caveats

  • Model downloads can be large; ensure enough disk space.
  • For production, prefer Static lookups where possible for performance.
  • Dynamic translation may add latency; consider caching translations.
  • If using private Hugging Face models, set the HF_TOKEN environment variable before running the CLI. On Windows, setx stores the variable for future sessions only, so open a new terminal afterwards:
setx HF_TOKEN "your_token_here"
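
The caching suggestion above can be sketched with the standard library. The wrapper below assumes a translation callable that takes the text and the language explicitly, so that both become part of the cache key. `cached` and `fake_translate` are hypothetical names, and `fake_translate` only stands in for a slow model call.

```python
from functools import lru_cache


def cached(translate, maxsize: int = 1024):
    """Wrap a `translate(text, lang)` callable with an in-memory LRU cache."""
    return lru_cache(maxsize=maxsize)(translate)


calls = []


def fake_translate(text: str, lang: str) -> str:
    calls.append((text, lang))  # stands in for a slow model call
    return f"[{lang}] {text}"


fast = cached(fake_translate)
fast("Hello", "es")
fast("Hello", "es")  # served from the cache, no second model call
print(len(calls))    # 1
```

If the language is held as state on the translator (as with `set_language`), the cache key must still include it, otherwise a cached Spanish result could be returned for another language.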

Credits

This package was created with Copier and the @12rambau/pypackage 0.1.18 project template.
