Multilingual support for OARepo

OARepo multilingual data model

Multilingual string data model for OARepo.

Installation

    pip install oarepo-multilingual

Usage

The library provides a multilingual type for JSON Schema, with marshmallow validation and deserialization and an Elasticsearch mapping. The multilingual type lets you store multilingual strings in your JSON documents in the format "en": "something", "en-us": "something else", with an optional default value under the key "_": "default value".
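As a rough illustration, a multilingual value can be pictured as a plain mapping from language tags to strings. The sketch below uses an ordinary Python dict; the helper `pick` and its fallback to the "_" entry are hypothetical and not part of the library.

```python
# A multilingual string is a mapping: language tag -> text.
# "_" holds the default value.
title = {
    "_": "default value",
    "en": "something",
    "en-us": "something else",
}


def pick(value, lang):
    """Hypothetical helper: return the text for lang, else the "_" default."""
    return value.get(lang, value.get("_"))


print(pick(title, "en-us"))
print(pick(title, "cs"))
```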

JSON Schema

Add this package to your dependencies and use it via $ref in json schema as "[server]/schemas/multilingual-v2.0.0.json#/definitions/multilingual".

Usage example

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "title": {
            "$ref": "https://localhost:5000/schemas/multilingual-v2.0.0.json#/definitions/multilingual"
      }
  }
}

{
  "type": "object",
  "properties": {
    "title": {
            "en": "something",
            "en-us": "something else"
      }
  }
}

Marshmallow

For data validation and deserialization.

If marshmallow validation is performed within an application context, languages are validated against the SUPPORTED_LANGUAGES config. If validation is performed outside an app context, the keys are not checked against a list of languages; instead a generic validation is performed: keys must be in ISO 639-1 format or in the language-region format from RFC 5646.
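The generic check can be pictured roughly as a pattern match on each key. The sketch below is not the library's implementation: the regular expression, the lowercase-only region, the acceptance of the "_" default key, and the function name `invalid_language_keys` are all assumptions made for illustration.

```python
import re

# Assumed pattern: two lowercase letters (ISO 639-1), optionally followed
# by "-" and a two-letter region, e.g. "en" or "en-us".
LANGUAGE_KEY = re.compile(r"[a-z]{2}(-[a-z]{2})?")


def invalid_language_keys(value):
    """Return keys of a multilingual dict that do not look like language tags.

    The "_" default key is accepted here as an assumption.
    """
    return [
        key
        for key in value
        if key != "_" and not LANGUAGE_KEY.fullmatch(key)
    ]
```

Called on {"en": "something", "english": "something else"}, this sketch would report only the "english" key.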

Usage example

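    import marshmallow

    # MultilingualStringSchemaV2 is provided by oarepo-multilingual;
    # import it from the package before running this example.

    class MD(marshmallow.Schema):
        title = MultilingualStringSchemaV2()

    data = {
        "title": {
            "en": "something",
            "en-us": "something else",
        }
    }

    MD().load(data)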

Supported languages validation

You can specify the supported languages in your application configuration under SUPPORTED_LANGUAGES. Only these languages are then allowed as keys of a multilingual string. Languages must be given in the format "en" or "en-us".

Usage example

app.config.update(SUPPORTED_LANGUAGES=["cs", "en"])
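The effect of this setting can be sketched without the library as a simple membership check. The function name `unsupported_languages` is hypothetical, and exact string matching (so that "en-us" is not covered by "en") is an assumption, not documented behaviour.

```python
SUPPORTED_LANGUAGES = ["cs", "en"]


def unsupported_languages(value, supported=SUPPORTED_LANGUAGES):
    """Return the keys of a multilingual dict that are not in the supported list."""
    return sorted(set(value) - set(supported))


# "de" is not configured, so it would be rejected in this sketch.
print(unsupported_languages({"cs": "něco", "de": "etwas"}))
```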

Elasticsearch mapping

Define the type of your multilingual string as multilingual. Add a definition of ELASTICSEARCH_DEFAULT_LANGUAGE_TEMPLATE to your configuration; it will be used as the default mapping template for the supported languages.

Default template example

ELASTICSEARCH_DEFAULT_LANGUAGE_TEMPLATE={
            "type": "text",
            "fields": {
                "keywords": {
                    "type": "keyword"
                }
            }
        }

You can also specify different templates for specific languages with ELASTICSEARCH_LANGUAGE_TEMPLATES. Use # followed by an id to add more than one template for a specific language.

Language templates example

ELASTICSEARCH_LANGUAGE_TEMPLATES={
        "cs": {
            "type": "text",
            "fields": {
                "keywords": {
                    "type": "keyword"
                }
            }
        },
        "cs#plain": {
            "type": "text",
        },
        "en": {
            "type": "text",
            "fields": {
                "keywords": {
                    "type": "keyword"
                }
            }
        }
    }

A placeholder '*' can be used instead of a particular language; the template is then applied to all SUPPORTED_LANGUAGES. The placeholder '*' can appear anywhere in the template, at any level. Currently '*' is the only supported placeholder, but this is expected to change.

ELASTICSEARCH_LANGUAGE_TEMPLATES={
        "*#context": {
            "type": "text",
            "copy_to": "field.*",
            "fields": {
                "raw": {
                    "type": "keyword"
                }
            }
        }

    }
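One way to picture the placeholder is as a substitution applied to every key and string value of the template, once per supported language. The sketch below is an illustration under that assumption; `expand_placeholder` is a hypothetical name and the library's actual expansion may differ.

```python
import copy


def expand_placeholder(templates, languages, placeholder="*"):
    """Expand entries whose key contains the placeholder into one entry per language."""

    def substitute(node, lang):
        # Replace the placeholder in dict keys, list items and strings, recursively.
        if isinstance(node, dict):
            return {substitute(k, lang): substitute(v, lang) for k, v in node.items()}
        if isinstance(node, list):
            return [substitute(item, lang) for item in node]
        if isinstance(node, str):
            return node.replace(placeholder, lang)
        return node

    expanded = {}
    for key, template in templates.items():
        if placeholder in key:
            for lang in languages:
                expanded[key.replace(placeholder, lang)] = substitute(template, lang)
        else:
            expanded[key] = copy.deepcopy(template)
    return expanded


templates = {
    "*#context": {
        "type": "text",
        "copy_to": "field.*",
        "fields": {"raw": {"type": "keyword"}},
    }
}
result = expand_placeholder(templates, ["cs", "en"])
```

Under these assumptions, `result` holds the keys "cs#context" and "en#context", with "copy_to" set to "field.cs" and "field.en" respectively.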

Usage example

{
  "mappings": {
    "properties": {
    "title":
      {"type": "multilingual"}
    }
  }
}

Usage example with context

{
  "mappings": {
    "properties": {
    "title":
      {"type": "multilingual#plain"}
    }
  }
}
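To make the two usage examples more concrete, the sketch below shows one plausible way a multilingual field could be turned into a per-language mapping. It is an assumption-laden illustration rather than the library's code: the expansion into an object with one property per language, the fallback to the default template, and the names `template_for` and `expand_multilingual` are all hypothetical.

```python
DEFAULT_TEMPLATE = {"type": "text", "fields": {"keywords": {"type": "keyword"}}}
LANGUAGE_TEMPLATES = {
    "cs#plain": {"type": "text"},
}


def template_for(lang, context=None, templates=LANGUAGE_TEMPLATES,
                 default=DEFAULT_TEMPLATE):
    """Pick the template for "lang" or "lang#context", else the default."""
    key = f"{lang}#{context}" if context else lang
    return templates.get(key, default)


def expand_multilingual(field_type, languages):
    """Turn "multilingual" or "multilingual#<id>" into an object mapping (assumed shape)."""
    _, _, context = field_type.partition("#")
    return {
        "type": "object",
        "properties": {
            lang: template_for(lang, context or None) for lang in languages
        },
    }


mapping = expand_multilingual("multilingual#plain", ["cs", "en"])
```

In this sketch "cs" picks up the "cs#plain" template, while "en", which has no "en#plain" entry, falls back to the default template.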

Analyzer configuration

You can specify analysis in the app configuration with ELASTICSEARCH_LANGUAGE_ANALYSIS. Use # followed by an id to add more than one analysis for a specific language.

Language analysis example

ELASTICSEARCH_LANGUAGE_ANALYSIS = {
    "cs#title": {
        "czech#title": {
            "type": "custom",
            "char_filter": ["html_strip"],
            "tokenizer": "standard"
        }
    },
    "cs": {
        "czech": {
            "type": "custom",
            "char_filter": ["html_strip"],
            "tokenizer": "standard",
            "filter": ["lowercase", "stop", "snowball"]
        }
    }
}

Usage example

{
"settings":{
      "analysis": {
        "analyzer": {
         "oarepo:extends": "multilingual_analysis"
          }
      }
},
"mappings": {
   ...
}
}

{
"settings":{
      "analysis": {
        "analyzer": {
         "oarepo:extends": "multilingual_analysis#title"
          }
      }
},
"mappings": {
   ...
}
}
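How the "oarepo:extends" entry might be resolved can be sketched as a lookup by language and optional id. This is a guess for illustration only: the function `analyzers_for`, the merge into a single analyzer dict, and the silent skipping of languages without an entry are assumptions, not documented behaviour.

```python
LANGUAGE_ANALYSIS = {
    "cs#title": {
        "czech#title": {
            "type": "custom",
            "char_filter": ["html_strip"],
            "tokenizer": "standard",
        }
    },
    "cs": {
        "czech": {
            "type": "custom",
            "char_filter": ["html_strip"],
            "tokenizer": "standard",
            "filter": ["lowercase", "stop", "snowball"],
        }
    },
}


def analyzers_for(extends, languages, analysis=LANGUAGE_ANALYSIS):
    """Collect analyzers for "multilingual_analysis" or "multilingual_analysis#<id>"."""
    _, _, context = extends.partition("#")
    analyzers = {}
    for lang in languages:
        key = f"{lang}#{context}" if context else lang
        # Languages without a matching entry contribute nothing in this sketch.
        analyzers.update(analysis.get(key, {}))
    return analyzers


title_analyzers = analyzers_for("multilingual_analysis#title", ["cs", "en"])
```

With the configuration above, `title_analyzers` contains only the "czech#title" analyzer, and a plain "multilingual_analysis" would select "czech".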

Changes

Version 2.5.0 (released 2021-03-24)

Added

  • Added a placeholder option that can be used instead of specifying a particular language

Version 2.0.0 (released 2020-08-21)

  • Initial public release.
