Multilingual support for OARepo
OARepo multilingual data model
Multilingual string data model for OARepo.
Installation
pip install oarepo-multilingual
Usage
The library provides a multilingual type for JSON Schema, together with marshmallow validation and deserialization and an Elasticsearch mapping.
A multilingual value lets you store language-tagged strings in your JSON documents in the form "en": "something", "en-us": "something else", optionally with a default value under the key "_": "default value".
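To make the format concrete, here is a minimal sketch of a multilingual value as a plain Python dictionary, with a lookup that falls back to the "_" default. The helper pick_translation is hypothetical and is not part of oarepo-multilingual; the fallback order (exact key, then base language, then "_") is an assumption made for illustration.

```python
from typing import Dict, Optional

# A multilingual string is a mapping from language code to text.
# The "_" key holds the default value.
title: Dict[str, str] = {
    "en": "something",
    "en-us": "something else",
    "_": "default value",
}


def pick_translation(value: Dict[str, str], language: str) -> Optional[str]:
    """Hypothetical helper: exact match, then base language, then "_"."""
    if language in value:
        return value[language]
    base = language.split("-")[0]  # "en-gb" -> "en"
    if base in value:
        return value[base]
    return value.get("_")  # None when no default is present
```

For example, pick_translation(title, "en-gb") falls back to the "en" entry, and pick_translation(title, "cs") falls back to the default value.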
JSON Schema
Add this package to your dependencies and reference the type via $ref in your JSON schema as "[server]/schemas/multilingual-v2.0.0.json#/definitions/multilingual".
Usage example
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "title": {
      "$ref": "https://localhost:5000/schemas/multilingual-v2.0.0.json#/definitions/multilingual"
    }
  }
}
{
  "type": "object",
  "properties": {
    "title": {
      "en": "something",
      "en-us": "something else"
    }
  }
}
Marshmallow
The marshmallow schema handles data validation and deserialization.
If marshmallow validation runs within an application context, language keys are validated against the SUPPORTED_LANGUAGES config. Outside an app context, the keys are not checked against a list of languages; instead a generic validation is performed: keys must be ISO 639-1 codes or use the language-region format from RFC 5646.
Usage example
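import marshmallow
# Import path assumed from the package name; adjust if your version differs.
from oarepo_multilingual.marshmallow import MultilingualStringSchemaV2

class MD(marshmallow.Schema):
    title = MultilingualStringSchemaV2()

data = {
    'title': {
        "en": "something",
        "en-us": "something else",
    }
}

# Validates the language keys and deserializes the data.
MD().load(data)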
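The generic key check that applies outside an app context can be approximated with a regular expression. This is a simplified sketch, not the library's actual validator: the pattern accepts only a two-letter lowercase language code, optionally followed by a two-letter lowercase region, which is an assumption based on the "en" and "en-us" examples. Whether the "_" default key passes the real validation is not covered here.

```python
import re
from typing import Dict, List

# Assumed pattern: "en" (ISO 639-1) or "en-us" (language-region).
LANGUAGE_KEY = re.compile(r"[a-z]{2}(?:-[a-z]{2})?")


def invalid_language_keys(value: Dict[str, str]) -> List[str]:
    """Hypothetical helper: return the keys that do not look like language codes."""
    return [key for key in value if not LANGUAGE_KEY.fullmatch(key)]
```

An empty result means every key has an acceptable shape; a key such as "english" would be reported.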
Supported languages validation
You can specify the supported languages in your application configuration under SUPPORTED_LANGUAGES. Only these languages are then allowed as keys of a multilingual string.
Languages must be given in the format "en" or "en-us".
Usage example
app.config.update(SUPPORTED_LANGUAGES = ["cs", "en"])
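The effect of this setting can be illustrated without a running application. The sketch below assumes that keys are compared to the configured list by exact match; the function unsupported_languages is hypothetical and only mirrors the behaviour described above.

```python
from typing import Dict, Iterable, List

SUPPORTED_LANGUAGES = ["cs", "en"]


def unsupported_languages(value: Dict[str, str], supported: Iterable[str]) -> List[str]:
    """Hypothetical helper: list the keys missing from the supported languages."""
    allowed = set(supported)
    return [key for key in value if key not in allowed]
```

Under this assumption, a value with the keys "en" and "en-us" would be rejected because "en-us" is not listed in the configuration.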
Elasticsearch mapping
Define the type of your multilingual field as multilingual.
Add ELASTICSEARCH_DEFAULT_LANGUAGE_TEMPLATE to your configuration; it is used as the default mapping template for the supported languages.
Default template example
ELASTICSEARCH_DEFAULT_LANGUAGE_TEMPLATE = {
    "type": "text",
    "fields": {
        "keywords": {
            "type": "keyword"
        }
    }
}
You can also specify different templates for specific languages with ELASTICSEARCH_LANGUAGE_TEMPLATES. Use # followed by an id to add further templates for a single language.
Language templates example
ELASTICSEARCH_LANGUAGE_TEMPLATES = {
    "cs": {
        "type": "text",
        "fields": {
            "keywords": {
                "type": "keyword"
            }
        }
    },
    "cs#plain": {
        "type": "text",
    },
    "en": {
        "type": "text",
        "fields": {
            "keywords": {
                "type": "keyword"
            }
        }
    }
}
A placeholder '*' can be used instead of a particular language; the template is then applied to all SUPPORTED_LANGUAGES. The placeholder '*' can appear anywhere in the template, at any level. Currently the only supported placeholder is *, but this is expected to change.
ELASTICSEARCH_LANGUAGE_TEMPLATES = {
    "*#context": {
        "type": "text",
        "copy_to": "field.*",
        "fields": {
            "raw": {
                "type": "keyword"
            }
        }
    }
}
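One way to picture the placeholder is as a substitution applied once per supported language. The sketch below is an illustration only and not the library's implementation: expand_placeholders and substitute are hypothetical names, and letting an explicitly configured key take precedence over an expanded one is an assumption.

```python
from typing import Any, Dict, List


def substitute(node: Any, language: str) -> Any:
    """Replace "*" in keys and string values at any nesting level."""
    if isinstance(node, str):
        return node.replace("*", language)
    if isinstance(node, dict):
        return {substitute(k, language): substitute(v, language) for k, v in node.items()}
    if isinstance(node, list):
        return [substitute(item, language) for item in node]
    return node


def expand_placeholders(templates: Dict[str, Any], supported: List[str]) -> Dict[str, Any]:
    """Hypothetical helper: turn "*" templates into one template per language."""
    # Explicit (non-placeholder) keys are copied first and win on conflict (assumption).
    expanded = {key: value for key, value in templates.items() if "*" not in key}
    for key, template in templates.items():
        if "*" in key:
            for language in supported:
                expanded.setdefault(key.replace("*", language), substitute(template, language))
    return expanded
```

With SUPPORTED_LANGUAGES set to ["cs", "en"], the "*#context" template above would yield "cs#context" and "en#context", with copy_to set to "field.cs" and "field.en" respectively.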
Usage example
{
  "mappings": {
    "properties": {
      "title": {"type": "multilingual"}
    }
  }
}
Usage example with context
{
  "mappings": {
    "properties": {
      "title": {"type": "multilingual#plain"}
    }
  }
}
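The relationship between the field type and the configured templates can be sketched as follows. This is a guess at the general idea, not the library's code: it assumes the field expands to an object with one property per supported language, that "multilingual#plain" selects the "<language>#plain" template, and that the default template is used when no specific one exists. The function expand_multilingual is hypothetical.

```python
from typing import Any, Dict, List


def expand_multilingual(
    field_type: str,
    supported: List[str],
    default_template: Dict[str, Any],
    language_templates: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical expansion of "multilingual" or "multilingual#<id>"."""
    # "multilingual#plain" -> context "plain"; "multilingual" -> context "".
    _, _, context = field_type.partition("#")
    properties = {}
    for language in supported:
        key = f"{language}#{context}" if context else language
        # Fall back to the default template when nothing specific is configured.
        properties[language] = language_templates.get(key, default_template)
    return {"type": "object", "properties": properties}
```

Under these assumptions, "multilingual#plain" with the templates above would give "cs" the plain text mapping, while "en" has no "en#plain" entry and would fall back to the default template.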
Analyzer configuration
You can specify the analysis in the app configuration with ELASTICSEARCH_LANGUAGE_ANALYSIS. Use # followed by an id to add further analysis definitions for a single language.
Language analysis example
ELASTICSEARCH_LANGUAGE_ANALYSIS = {
    "cs#title": {"czech#title": {
        "type": "custom",
        "char_filter": [
            "html_strip"
        ],
        "tokenizer": "standard"
    }},
    "cs": {"czech": {
        "type": "custom",
        "char_filter": [
            "html_strip"
        ],
        "tokenizer": "standard",
        "filter": [
            "lowercase",
            "stop",
            "snowball"
        ]
    }}
}
Usage example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "oarepo:extends": "multilingual_analysis"
      }
    }
  },
  "mappings": {
    ...
  }
}
{
  "settings": {
    "analysis": {
      "analyzer": {
        "oarepo:extends": "multilingual_analysis#title"
      }
    }
  },
  "mappings": {
    ...
  }
}
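As a rough illustration of how the reference in "oarepo:extends" might relate to the configuration, the sketch below gathers the analyzers whose key matches the requested id. It rests on assumptions: that "multilingual_analysis" selects the plain language keys, that "multilingual_analysis#title" selects the "<language>#title" keys, and that languages without an entry are skipped. The function collect_analyzers is hypothetical, and the configuration is an abbreviated copy of the example above.

```python
from typing import Any, Dict, List

# Abbreviated copy of the analysis configuration, for illustration only.
ELASTICSEARCH_LANGUAGE_ANALYSIS = {
    "cs#title": {"czech#title": {"type": "custom", "tokenizer": "standard"}},
    "cs": {"czech": {"type": "custom", "tokenizer": "standard",
                     "filter": ["lowercase", "stop", "snowball"]}},
}


def collect_analyzers(
    reference: str,
    supported: List[str],
    language_analysis: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: merge the analyzers selected by the reference."""
    _, _, context = reference.partition("#")
    analyzers: Dict[str, Any] = {}
    for language in supported:
        key = f"{language}#{context}" if context else language
        # Languages with no configured analysis contribute nothing.
        analyzers.update(language_analysis.get(key, {}))
    return analyzers
```

With supported languages ["cs", "en"], the plain reference would collect the "czech" analyzer and the "#title" reference would collect "czech#title".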
Changes
Version 2.5.0 (released 2021-03-24)
Added
- Added a placeholder option as an alternative to specifying a particular language
Version 2.0.0 (released 2020-08-21)
- Initial public release.