
OL Open edX Course Translations

An Open edX plugin to manage course translations.

Purpose

Translate course content into multiple languages to enhance accessibility for a global audience.

Setup

For detailed installation instructions, please refer to the plugin installation guide.

Installation is required in:

  • Studio (CMS)

  • LMS

Configuration

  • Add the following configuration values to the Open edX config files. For any release after Juniper, those files are /edx/etc/lms.yml and /edx/etc/cms.yml. If you’re using private.py, add the values to lms/envs/private.py and cms/envs/private.py instead. The values belong at the top level of the file. Ask a fellow developer for the actual values.

    # Output directory for translated courses
    # Default: /openedx/data/course_translations/
    COURSE_TRANSLATIONS_BASE_DIR: "/openedx/data/course_translations/"

    # Translation providers configuration
    TRANSLATIONS_PROVIDERS:
      default_provider: "mistral"  # Default provider to use
      deepl:
        api_key: "<YOUR_DEEPL_API_KEY>"
      openai:
        api_key: "<YOUR_OPENAI_API_KEY>"
        default_model: "gpt-5.2"
      gemini:
        api_key: "<YOUR_GEMINI_API_KEY>"
        default_model: "gemini-3-pro-preview"
      mistral:
        api_key: "<YOUR_MISTRAL_API_KEY>"
        default_model: "mistral-large-latest"

    TRANSLATIONS_GITHUB_TOKEN: "<YOUR_GITHUB_TOKEN>"
    TRANSLATIONS_REPO_PATH: ""
    TRANSLATIONS_REPO_URL: "https://github.com/mitodl/mitxonline-translations.git"
    LITE_LLM_REQUEST_TIMEOUT: 300  # Timeout for LLM API requests, in seconds
  • For Tutor installations, these values can also be managed through a custom Tutor plugin.
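    For reference, a minimal sketch of such a Tutor plugin, assuming Tutor's Python plugin API (tutor.hooks) and its openedx-common-settings patch; the module name and setting values are illustrative and should be adapted to your deployment:

    ```python
    # my_translations_plugin.py -- sketch of a Tutor plugin that injects the
    # settings above into both LMS and CMS. The patch body is rendered into
    # the generated Django settings; values here are placeholders.
    from tutor import hooks

    hooks.Filters.ENV_PATCHES.add_item(
        (
            "openedx-common-settings",  # applied to both LMS and CMS settings
            """
    COURSE_TRANSLATIONS_BASE_DIR = "/openedx/data/course_translations/"
    TRANSLATIONS_PROVIDERS = {
        "default_provider": "mistral",
        "mistral": {
            "api_key": "<YOUR_MISTRAL_API_KEY>",
            "default_model": "mistral-large-latest",
        },
    }
    LITE_LLM_REQUEST_TIMEOUT = 300
    """,
        )
    )
    ```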

Translation Providers

The plugin supports multiple translation providers:

  • DeepL

  • OpenAI (GPT models)

  • Gemini (Google)

  • Mistral

Configuration

All providers are configured through the TRANSLATIONS_PROVIDERS dictionary in your settings:

TRANSLATIONS_PROVIDERS = {
    "default_provider": "mistral",  # Optional: default provider for commands
    "deepl": {
        "api_key": "<YOUR_DEEPL_API_KEY>",
    },
    "openai": {
        "api_key": "<YOUR_OPENAI_API_KEY>",
        "default_model": "gpt-5.2",  # Optional: used when model not specified
    },
    "gemini": {
        "api_key": "<YOUR_GEMINI_API_KEY>",
        "default_model": "gemini-3-pro-preview",
    },
    "mistral": {
        "api_key": "<YOUR_MISTRAL_API_KEY>",
        "default_model": "mistral-large-latest",
    },
}

Important Notes:

  1. DeepL Configuration: DeepL must be configured in TRANSLATIONS_PROVIDERS['deepl']['api_key'].

  2. DeepL for Subtitle Repair: DeepL is used as a fallback repair mechanism for subtitle translations when LLM providers fail validation. Even if you use LLM providers for primary translation, you should configure DeepL to enable automatic repair.

  3. Default Models: The default_model in each provider’s configuration is used when you specify a provider without a model (e.g., openai instead of openai/gpt-5.2).

Provider Selection

You can specify providers in three ways:

  1. Provider only (uses default model from settings):

./manage.py cms translate_course \
    --target-language ar \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider openai \
    --srt-translation-provider gemini
  2. Provider with specific model:

./manage.py cms translate_course \
    --target-language ar \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider openai/gpt-5.2 \
    --srt-translation-provider gemini/gemini-3-pro-preview
  3. DeepL (no model needed):

./manage.py cms translate_course \
    --target-language ar \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider deepl \
    --srt-translation-provider deepl

Note: If you specify a provider without a model (e.g., openai instead of openai/gpt-5.2), the system will use the default_model configured in TRANSLATIONS_PROVIDERS for that provider.
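The resolution logic just described can be sketched as follows; resolve_provider_spec is a hypothetical helper (not the plugin's actual API), and the settings dict mirrors the TRANSLATIONS_PROVIDERS example above:

```python
# Illustrative sketch of how a "PROVIDER" or "PROVIDER/MODEL" argument
# resolves to a concrete (provider, model) pair.
TRANSLATIONS_PROVIDERS = {
    "default_provider": "mistral",
    "deepl": {"api_key": "..."},
    "openai": {"api_key": "...", "default_model": "gpt-5.2"},
}

def resolve_provider_spec(spec, providers=TRANSLATIONS_PROVIDERS):
    """Split 'PROVIDER/MODEL'; fall back to the provider's default_model."""
    provider, _, model = spec.partition("/")
    if provider not in providers:
        raise ValueError(f"Unknown translation provider: {provider!r}")
    # DeepL has no default_model, so it resolves to (provider, None).
    return provider, model or providers[provider].get("default_model")
```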

Translating a Course

  1. Open the course in Studio.

  2. Go to Tools -> Export Course.

  3. Export the course as a .tar.gz file.

  4. Open the CMS shell.

  5. Run the management command to translate the course:

    ./manage.py cms translate_course \
        --source-language en \
        --target-language ar \
        --course-dir /path/to/course.tar.gz \
        --content-translation-provider openai \
        --srt-translation-provider gemini \
        --translation-validation-provider openai/gpt-5.2 \
        --content-glossary /path/to/content/glossary \
        --srt-glossary /path/to/srt/glossary

Command Options:

  • --source-language: Source language code (default: en)

  • --target-language: Target language code (required)

  • --course-dir: Path to exported course tar.gz file (required)

  • --content-translation-provider: Translation provider for content (XML/HTML and text) (required).

    Format:

    • deepl - uses DeepL (no model needed)

    • PROVIDER - uses provider with default model from settings (e.g., openai, gemini, mistral)

    • PROVIDER/MODEL - uses provider with specific model (e.g., openai/gpt-5.2, gemini/gemini-3-pro-preview, mistral/mistral-large-latest)

  • --srt-translation-provider: Translation provider for SRT subtitles (required). Same format as --content-translation-provider

  • --translation-validation-provider: Optional provider to validate/fix XML/HTML translations after translation.

  • --content-glossary: Path to glossary directory for content (XML/HTML and text) translation (optional)

  • --srt-glossary: Path to glossary directory for SRT subtitle translation (optional)

Examples:

# Use DeepL for both content and subtitles
./manage.py cms translate_course \
    --target-language ar \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider deepl \
    --srt-translation-provider deepl

# Use OpenAI and Gemini with default models from settings
./manage.py cms translate_course \
    --target-language fr \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider openai \
    --srt-translation-provider gemini

# Use OpenAI with specific model for content, Gemini with default for subtitles
./manage.py cms translate_course \
    --target-language fr \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider openai/gpt-5.2 \
    --srt-translation-provider gemini

# Use Mistral with specific model and separate glossaries for content and SRT
./manage.py cms translate_course \
    --target-language es \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider mistral/mistral-large-latest \
    --srt-translation-provider mistral/mistral-large-latest \
    --content-glossary /path/to/content/glossary \
    --srt-glossary /path/to/srt/glossary

# Use different glossaries for content vs subtitles
./manage.py cms translate_course \
    --target-language es \
    --course-dir /path/to/course.tar.gz \
    --content-translation-provider openai \
    --srt-translation-provider gemini \
    --content-glossary /path/to/technical/glossary \
    --srt-glossary /path/to/conversational/glossary

Glossary Support:

You can use separate glossaries for content and subtitle translation. This allows you to apply different terminology choices based on context:

  • Content glossary (--content-glossary): Used for XML/HTML content, policy files, and text-based course materials. Typically contains more formal or technical terminology.

  • SRT glossary (--srt-glossary): Used for subtitle translation. Can contain more conversational or context-specific terms appropriate for spoken content.

Create language-specific glossary files in each glossary directory:

# Content glossary structure
glossaries/technical/
├── ar.txt  # Arabic glossary
├── fr.txt  # French glossary
└── es.txt  # Spanish glossary

# SRT glossary structure
glossaries/conversational/
├── ar.txt  # Arabic glossary
├── fr.txt  # French glossary
└── es.txt  # Spanish glossary

Format: one term per line as “source_term : translated_term”, as in this example Spanish (es) glossary file:

# es HINTS
## TERM MAPPINGS
These are preferred terminology choices for this language. Use them whenever they sound natural; adapt freely if context requires.

- 'accuracy' : 'exactitud'
- 'activation function' : 'función de activación'
- 'artificial intelligence' : 'inteligencia artificial'
- 'AUC' : 'AUC'

Note: Both glossary arguments are optional. If not provided, translation will proceed without glossary terms. You can provide one, both, or neither glossary as needed.
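To make the file format concrete, a minimal parser for glossary files of this shape might look like the following; parse_glossary is a hypothetical helper for illustration, not part of the plugin:

```python
def parse_glossary(text):
    """Parse glossary lines of the form "- 'source' : 'target'" into a dict.

    Blank lines, '#' section headers, and prose lines without a colon are
    skipped. Illustrative only; the plugin's actual parser may differ.
    """
    terms = {}
    for line in text.splitlines():
        line = line.strip().lstrip("- ").strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        source, _, target = line.partition(":")
        terms[source.strip().strip("'\"")] = target.strip().strip("'\"")
    return terms
```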

Subtitle Translation and Validation

The course translation system includes robust subtitle (SRT) translation with automatic validation and retry mechanisms to ensure high-quality translations with preserved timing information.

Translation Process

The subtitle translation follows a multi-stage process with built-in quality checks:

  1. Initial Translation: Subtitles are translated using your configured provider (DeepL or LLM)

  2. Validation: Timestamps, subtitle count, and content are validated to ensure integrity

  3. Automatic Retry: If validation fails, the system automatically retries translation (up to 1 additional attempt)

  4. Task Failure: If all retries fail validation, the translation task fails to prevent corrupted subtitle files
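The steps above can be sketched as a small control loop (the function names are illustrative, not the plugin's internals):

```python
def translate_with_retry(translate, validate, srt_text, max_retries=1):
    """Run translation, retrying once if validation reports errors.

    `translate` maps source SRT text to translated SRT text; `validate`
    returns a list of error strings (empty means valid). Raises ValueError
    when every attempt fails validation, mirroring the behavior described.
    """
    errors = []
    for _attempt in range(1 + max_retries):
        translated = translate(srt_text)
        errors = validate(srt_text, translated)
        if not errors:
            return translated
    raise ValueError(f"Subtitle translation failed validation: {errors}")
```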

Validation Rules

The system validates subtitle translations against these criteria:

  • Subtitle Count: Translated file must have the same number of subtitle blocks as the original

  • Index Matching: Each subtitle block index must match the original (e.g., if original has blocks 1-100, translation must have blocks 1-100 in the same order)

  • Timestamp Preservation: Start and end times for each subtitle block must remain unchanged

  • Content Validation: Non-empty original subtitles must have non-empty translations (blank translations are flagged as errors)
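The four rules can be expressed as a standalone checker; this is an illustrative sketch with a deliberately minimal SRT parser, not the plugin's actual validator:

```python
import re

def parse_srt(text):
    """Split SRT text into (index, timestamp_line, subtitle_text) blocks."""
    blocks = []
    for raw in re.split(r"\n\s*\n", text.strip()):
        lines = raw.splitlines()
        blocks.append((int(lines[0]), lines[1], "\n".join(lines[2:]).strip()))
    return blocks

def validate_srt_translation(original, translated):
    """Return a list of validation errors (an empty list means valid)."""
    orig, trans = parse_srt(original), parse_srt(translated)
    if len(orig) != len(trans):
        return [f"block count mismatch: {len(orig)} vs {len(trans)}"]
    errors = []
    for (oi, ots, otext), (ti, tts, ttext) in zip(orig, trans):
        if oi != ti:
            errors.append(f"index mismatch: {oi} vs {ti}")
        if ots != tts:
            errors.append(f"timestamp changed in block {oi}")
        if otext and not ttext:
            errors.append(f"empty translation in block {oi}")
    return errors
```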

Example Validation Process:

1. Initial Translation (using OpenAI):
   ✓ 150 subtitle blocks translated
   ✗ Validation failed: 3 blocks have mismatched timestamps

2. Retry Attempt:
   ✓ 150 subtitle blocks translated
   ✗ Validation failed: 2 blocks still have issues

3. Task Failure:
   ❌ Translation failed after all retries
   ❌ Task aborted to prevent corrupted subtitle files

Failure Handling

If subtitle translation fails after all attempts:

  • The translation task will fail with a ValueError

  • The entire course translation will be aborted to prevent incomplete translations

  • The translated course directory will be automatically cleaned up

  • An error message will indicate which subtitle file caused the failure

  • No partial or corrupted translation files will be left behind

Generating Static Content Translations

This command synchronizes translation keys from edx-platform and the MFEs, translates empty keys using an LLM, and automatically creates a pull request in the translations repository.

What it does:

  1. Syncs translation keys from edx-platform and MFEs to the translations repository

  2. Extracts empty translation keys that need translation

  3. Translates empty keys using the specified LLM provider and model

  4. Applies translations to JSON and PO files

  5. Commits changes to a new branch

  6. Creates a pull request with translation statistics

Usage:

  1. Open the CMS shell.

  2. Run the management command:

    ./manage.py cms sync_and_translate_language <LANGUAGE_CODE> [OPTIONS]

Required arguments:

  • LANGUAGE_CODE: Language code (e.g., el, fr, es_ES)

Optional arguments:

  • --iso-code: ISO code for JSON files (default: same as language code)

  • --provider: Translation provider (openai, gemini, mistral). Default is taken from TRANSLATIONS_PROVIDERS['default_provider'] setting

  • --model: LLM model name. If not specified, uses the default_model for the selected provider from TRANSLATIONS_PROVIDERS. Examples: gpt-5.2, gemini-3-pro-preview, mistral-large-latest

  • --repo-path: Path to mitxonline-translations repository (can also be set via TRANSLATIONS_REPO_PATH setting or environment variable)

  • --repo-url: GitHub repository URL (default: https://github.com/mitodl/mitxonline-translations.git, can also be set via TRANSLATIONS_REPO_URL setting or environment variable)

  • --glossary: Path to glossary directory (optional). Should contain language-specific files (e.g. {iso_code}.txt).

  • --batch-size: Number of keys to translate per API request (default: 200, recommended: 200-300 for most models)

  • --mfe: Filter by specific MFE(s). Use edx-platform for backend translations.

  • --dry-run: Run without committing or creating PR
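As an illustration of what --batch-size controls, keys are sent to the LLM in fixed-size chunks; a hypothetical sketch (batch_keys is not the plugin's actual code):

```python
def batch_keys(keys, batch_size=200):
    """Yield consecutive chunks of at most batch_size translation keys,
    one chunk per API request."""
    for start in range(0, len(keys), batch_size):
        yield keys[start:start + batch_size]
```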

Examples:

# Use default provider (from TRANSLATIONS_PROVIDERS['default_provider']) with its default model
./manage.py cms sync_and_translate_language el

# Use OpenAI provider with its default model (gpt-5.2)
./manage.py cms sync_and_translate_language el --provider openai

# Use OpenAI provider with a specific model
./manage.py cms sync_and_translate_language el --provider openai --model gpt-5.2

# Use Mistral provider with a specific model and glossary
./manage.py cms sync_and_translate_language el --provider mistral --model mistral-large-latest --glossary /path/to/glossary --batch-size 250

License

The code in this repository is licensed under the AGPL 3.0 unless otherwise noted.

Please see LICENSE.txt for details.
