An inclusion mechanism for elasticsearch mappings

OAREPO mapping includes

This package adds support for inclusions in Elasticsearch mappings.

Example

Title, abstract, and description are multilingual strings that look like this:

{
  "en": "English version",
  "cs": "Czech version"
}

As Elasticsearch does not support includes, the mapping for these three properties would be large and repetitive. With this library, you can create a shared mapping, for example multilingual-v1.0.0.json, and reference it:

// multilingual-v1.0.0.json
{
    "text": {
        "type": "object",
        "properties": {
            "en": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 100
                    }
                } 
            },
            "cs": {
                "type": "text",
                "analyzer": "czech",
                "fields": {
                    "keyword": {
                        "ignore_above": 100,
                        "type": "icu_collation_keyword",
                        "language": "cs"
                    }
                }
            }
        }
    },
    "analysis": {
        // definition of czech analyzer
    }
}

// main mapping
{
    "settings": {
        "analysis": {
            "oarepo:extends": "multilingual-v1.0.0.json#/analysis"
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "multilingual-v1.0.0.json#/text",
                // extra properties for title might go here and are merged in
            },
            "description": {
                "type": "multilingual-v1.0.0.json#/text"
            },
            "abstract": {
                "type": "multilingual-v1.0.0.json#/text"
            }
        }
    }
}

The included mapping can be located inside Invenio with no external URL, hosted on a web server and referenced by http:// or https://, or even generated dynamically on demand.

Installation

pip install oarepo-mapping-includes

Configuration

This library has to know where the included mappings are located (unless they are hosted on an external server). Specify this in entry points (my_repo is the top-level package of your repository):

setup(
    # ...
    entry_points={
        "oarepo_mapping_includes": [
            "my_repo = my_repo.mapping_includes"
        ],
        "oarepo_mapping_handlers": [
            "something_discussed_later = my_repo.mapping_handlers:dynamic"
        ]
    }
)

Included files location

The package registered under oarepo_mapping_includes is expected to have the following structure, the same as in Invenio:

my_repo
    +- mapping_includes                     <-- as defined in entrypoint 
        +- v7                               <-- ES version
            +- multilingual-v1.0.0.json     <-- this is referenced in type, oarepo:extends

Supported constructs

type

The library checks whether the type is either an external resource (http://, https://) or a registered internal resource. If it is neither, the type is left intact. Otherwise the definition is obtained in the following way:

  1. If any mapping handlers are registered for the value of the type, they are used.
  2. The resource (without the # part, if present) is fetched from the internal cache or from the external URI.
  3. If # is not part of the type, the whole resource is returned.
  4. If the first character after the hash is /, the rest is treated as a JSON pointer and applied to the resource. The result of the JSON pointer is returned. An error is raised if the path does not exist.
  5. Otherwise the element whose $id property has this value is returned. An error is raised if no element with this id exists.
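
Steps 2 to 5 can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: the function names, the `resources` dictionary standing in for the internal cache, and the use of `KeyError` for the "error is raised" cases are all hypothetical.

```python
from typing import Any, Optional, Tuple


def split_reference(type_value: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split a reference into (resource, json_pointer, id)."""
    resource, sep, fragment = type_value.partition("#")
    if not sep:
        return resource, None, None          # no '#': whole resource
    if fragment.startswith("/"):
        return resource, fragment, None      # '#/...': JSON pointer
    return resource, None, fragment          # '#name': lookup by $id


def apply_json_pointer(document: Any, pointer: str) -> Any:
    """Follow a JSON pointer (RFC 6901); raise KeyError if the path is missing."""
    current = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            try:
                current = current[int(token)]
            except (ValueError, IndexError):
                raise KeyError(pointer)
        elif isinstance(current, dict) and token in current:
            current = current[token]
        else:
            raise KeyError(pointer)
    return current


def find_by_id(document: Any, wanted: str) -> Any:
    """Depth-first search for a dict whose "$id" equals ``wanted``."""
    if isinstance(document, dict):
        if document.get("$id") == wanted:
            return document
        children = list(document.values())
    elif isinstance(document, list):
        children = document
    else:
        children = []
    for child in children:
        try:
            return find_by_id(child, wanted)
        except KeyError:
            continue
    raise KeyError(wanted)


def resolve(type_value: str, resources: dict) -> Any:
    """Return the referenced definition, or None if the type is not a known resource."""
    resource, pointer, element_id = split_reference(type_value)
    if resource not in resources:
        return None                          # e.g. "keyword": left intact
    document = resources[resource]
    if pointer is not None:
        return apply_json_pointer(document, pointer)
    if element_id is not None:
        return find_by_id(document, element_id)
    return document
```

With a `resources` dictionary holding the parsed multilingual-v1.0.0.json, `resolve("multilingual-v1.0.0.json#/text", resources)` would return the `text` object, while `resolve("keyword", resources)` returns `None` because `keyword` is not a registered resource.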

The definition is then merged with any other elements present at the same level; conflicting values are overwritten (think of inheritance in Python).
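
One way to picture the merge is a recursive dictionary update. The sketch below assumes that values written next to the reference win over values from the included definition, in the way a subclass overrides its base class; the text does not spell out the direction, and `merge_included` is a hypothetical helper, not part of the library.

```python
from copy import deepcopy


def merge_included(included: dict, local: dict,
                   skip=("type", "oarepo:extends")) -> dict:
    """Merge ``local`` (the element holding the reference) over ``included``.

    Assumption: local values override included ones. The keys in ``skip``
    carry the reference itself at the top level, so they are not copied.
    """
    result = deepcopy(included)
    for key, value in local.items():
        if key in skip:
            continue
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Nested objects are merged key by key; nothing is skipped below
            # the top level, so nested "type" values are kept.
            result[key] = merge_included(result[key], value, skip=())
        else:
            result[key] = deepcopy(value)
    return result


included = {"type": "object", "properties": {"en": {"type": "text"}}}
local = {
    "type": "multilingual-v1.0.0.json#/text",
    "properties": {"de": {"type": "text"}},
    "copy_to": "all_fields",
}
merged = merge_included(included, local)
```

Here `merged` keeps `"type": "object"` from the included definition, contains both `en` and `de` under `properties`, and gains the extra `copy_to` key.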

Array of types

Multiple types are supported.

{
  // ...
  "title": {
    "type": [
      "multilingual-v1.0.0.json#/text",
      "copy-v1.0.0.json"
    ]
  }
}

Where copy-v1.0.0.json might contain:

{
  "copy_to": "all_fields"
}

On conflict, an algorithm similar to Python inheritance is used.
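
As a rough illustration of the inheritance analogy, the standard library's `ChainMap` resolves a key from the first mapping that defines it, much like attribute lookup across base classes. The sketch assumes that earlier entries in the type array take precedence over later ones and that locally written keys take precedence over both; it is also shallow, whereas the real merge may recurse into nested objects.

```python
from collections import ChainMap

# Definitions obtained from the two referenced resources (illustrative data;
# the "copy_to" in text_definition is added here only to force a conflict).
text_definition = {"type": "object", "copy_to": "from_text"}
copy_definition = {"copy_to": "all_fields", "store": True}

# Keys written directly next to "type" in the main mapping.
local_keys = {"store": False}

# Lookup order: local keys, then the first listed type, then the second.
combined = dict(ChainMap(local_keys, text_definition, copy_definition))
```

Under these assumptions `combined["copy_to"]` comes from the first listed type and `combined["store"]` from the local element.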

oarepo:extends

oarepo:extends behaves exactly the same way as type, but it can be used anywhere in the mapping.

Dynamic handlers

Sometimes it is better to create the mapping dynamically. For example, the number of supported languages varies from installation to installation, and the supported languages are specified in invenio.cfg.

In entry points, define oarepo_mapping_handlers. The left-hand side before '=' is what should match the value of type or oarepo:extends. It can be either the full value or the part of the value before #.
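
The matching rule can be sketched as a two-step dictionary lookup. The `handlers` dictionary stands in for the loaded entry points, and trying the full value before the part in front of `#` is an assumption about precedence; `find_handler` is a hypothetical name.

```python
from typing import Callable, Dict, Optional


def find_handler(type_value: str,
                 handlers: Dict[str, Callable]) -> Optional[Callable]:
    """Return the handler registered for the full value or for the part before '#'."""
    if type_value in handlers:
        return handlers[type_value]
    resource = type_value.partition("#")[0]
    return handlers.get(resource)


def dynamic(**kwargs):
    # Placeholder handler used only for this illustration.
    return {"type": "keyword"}


handlers = {"multilingual": dynamic}
```

With this registry, both `find_handler("multilingual", handlers)` and `find_handler("multilingual#/text", handlers)` return `dynamic`, while an unregistered value returns `None`.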

The handler's signature is:

def handler(type=None, resource=None, id=None, json_pointer=None, app=None, 
            content=None, root=None, content_pointer=None, **kwargs):
    """
    :param type         the type as literally written in "type" or "oarepo:extends" properties
    :param resource     part of the type before '#'
    :param id           part of the type after '#' if it does not start with '/'  
    :param json_pointer part of the type after '#' if it starts with '/'
    :param app          current flask application. Use app.config to get the current config
    :param content      json element containing the ``type`` 
    :param root         the whole mapping
    :param content_pointer 
                        json pointer of the content element
    :param **kwargs     think of extensibility
    """
    return {...}
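
As an example of such a handler, the sketch below builds a multilingual mapping from the application configuration. The config key `SUPPORTED_LANGUAGES` and the handler name are hypothetical, and the fake application object exists only so the snippet runs without Flask.

```python
from types import SimpleNamespace


def multilingual_handler(type=None, resource=None, id=None, json_pointer=None,
                         app=None, content=None, root=None,
                         content_pointer=None, **kwargs):
    """Generate one text sub-field per configured language."""
    # SUPPORTED_LANGUAGES is an assumed config key, not defined by the library.
    languages = app.config.get("SUPPORTED_LANGUAGES", ["en"])
    return {
        "type": "object",
        "properties": {
            lang: {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 100}
                },
            }
            for lang in languages
        },
    }


# Stand-in for the Flask application: only the .config attribute is needed here.
fake_app = SimpleNamespace(config={"SUPPORTED_LANGUAGES": ["en", "cs"]})
mapping = multilingual_handler(type="multilingual", app=fake_app)
```

The returned dictionary has one entry under `properties` for each configured language, so adding a language to the configuration changes the mapping without editing any JSON file.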

Merging and replacing content

The handler can return either a dictionary or an instance of oarepo_mapping_includes.Mapping.

If it returns a dictionary, it is merged with the original mapping content (such as extra properties).

If it returns Mapping(mapping=<dict>, merge=True), the merge parameter determines whether the original mapping content is merged in (True) or completely replaced (False).

This is useful if the handler needs to transform the original content rather than simply merge with it.
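
A handler that replaces the content might look like the sketch below. To keep the snippet runnable on its own, `Mapping` is a local stand-in with the two parameters named above; in a real project it would be imported from oarepo_mapping_includes. The `max_length` property and the handler name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """Local stand-in mirroring Mapping(mapping=<dict>, merge=True) from the text."""
    mapping: dict
    merge: bool = True


def bounded_keyword_handler(type=None, content=None, **kwargs):
    """Translate a made-up "max_length" property into "ignore_above".

    merge=False asks for the original content to be replaced, so the
    made-up "max_length" key does not end up in the final mapping.
    """
    content = content or {}
    limit = content.get("max_length", 256)
    return Mapping(mapping={"type": "keyword", "ignore_above": limit},
                   merge=False)


result = bounded_keyword_handler(
    type="bounded-keyword",
    content={"type": "bounded-keyword", "max_length": 50},
)
```

Returning a plain dictionary here would merge it with the original content and leave `max_length` in place, which is why the replacing form is used in this example.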
