An inclusion mechanism for elasticsearch mappings

OAREPO mapping includes

This package adds support for inclusions in elasticsearch mappings.

Example

Suppose that title, abstract and description are multilingual strings that look like this:

{
  "en": "English version",
  "cs": "Czech version"
}

Elasticsearch has no support for includes, so the mapping for these three properties would have to repeat the same definition and become quite large. With this library, you can put the shared definition into a separate mapping, for example multilingual-v1.0.0.json, and reference it:

// multilingual-v1.0.0.json
{
    "text": {
        "type": "object",
        "properties": {
            "en": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 100
                    }
                } 
            },
            "cs": {
                "type": "text",
                "analyzer": "czech",
                "fields": {
                    "keyword": {
                        "ignore_above": 100,
                        "type": "icu_collation_keyword",
                        "language": "cs"
                    }
                }
            }
        }
    },
    "analysis": {
        // definition of czech analyzer
    }
}

// main mapping
{
    "settings": {
        "analysis": {
            "oarepo:extends": "multilingual-v1.0.0.json#/analysis"
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "multilingual-v1.0.0.json#/text",
                // extra properties for title might go here and are merged in
            },
            "description": {
                "type": "multilingual-v1.0.0.json#/text"
            },
            "abstract": {
                "type": "multilingual-v1.0.0.json#/text"
            }
        }
    }
}

The included mapping can be located inside Invenio with no external URL, hosted on a web server and referenced by http:// or https://, or even generated dynamically on demand.

Installation

pip install oarepo-mapping-includes

Configuration

The library has to know where included mappings are located, unless they are hosted on an external server. Specify this in entry points (my_repo is the top-level package of your repository):

setup(
    # ...
    entry_points={
        "oarepo_mapping_includes": [
            "my_repo = my_repo.mapping_includes"
        ],
        "oarepo_mapping_handlers": [
            "something_discussed_later = my_repo.mapping_handlers:dynamic"
        ]
    }
)

Included files location

The package registered under the oarepo_mapping_includes entry point is expected to have the following structure, the same as for mappings in Invenio:

my_repo
    +- mapping_includes                     <-- as defined in entrypoint 
        +- v7                               <-- ES version
            +- multilingual-v1.0.0.json     <-- this is referenced in type, oarepo:extends

Supported constructs

type

Checks whether the type is either an external resource (http://, https://) or a registered internal resource. If it is neither, the type is left intact. Otherwise the definition is obtained in the following way:

  1. If any mapping handlers are registered for the value of the type, they are used.
  2. The resource (without the # part, if present) is fetched from the internal cache or from the external URI.
  3. If # is not part of the type, the whole resource is returned.
  4. If the first character after # is /, the fragment is treated as a JSON pointer and applied to the resource. The result of the JSON pointer is returned. An error is raised if the path does not exist.
  5. Otherwise, the element whose $id property has this value is returned. An error is raised if no element with this id exists.

The definition is then merged with any other elements present at the same level; conflicting values are overwritten (think of inheritance in Python).
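The resolution steps can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the function names (resolve_pointer, find_by_id, resolve_include, expand) and the in-memory resources dictionary standing in for the internal cache are hypothetical, step 1 (mapping handlers) is left out, and the sketch assumes that elements written next to "type" win over the included definition.

```python
import copy


def resolve_pointer(document, pointer):
    """Apply a JSON pointer such as '/text/properties' to nested dicts and lists."""
    node = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        # A missing key or index raises KeyError/IndexError, mirroring step 4.
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def find_by_id(node, wanted):
    """Depth-first search for the first dict whose '$id' equals wanted."""
    if isinstance(node, dict):
        if node.get("$id") == wanted:
            return node
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_by_id(child, wanted)
        if found is not None:
            return found
    return None


def resolve_include(type_value, resources):
    """Return the included definition, or None if type_value is a plain ES type."""
    resource, has_hash, fragment = type_value.partition("#")
    if resource not in resources:
        return None  # e.g. "keyword": left intact
    document = resources[resource]
    if not has_hash:
        return copy.deepcopy(document)  # step 3: whole resource
    if fragment.startswith("/"):
        return copy.deepcopy(resolve_pointer(document, fragment))  # step 4
    found = find_by_id(document, fragment)  # step 5
    if found is None:
        raise KeyError(f"no element with $id {fragment!r} in {resource!r}")
    return copy.deepcopy(found)


def deep_merge(base, override):
    """Merge override into a copy of base; override wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def expand(element, resources):
    """Replace the 'type' reference by the included definition and merge siblings."""
    included = resolve_include(element["type"], resources)
    if included is None:
        return element
    siblings = {k: v for k, v in element.items() if k != "type"}
    return deep_merge(included, siblings)  # assumption: sibling keys win


# Stand-in for the internal cache of included mappings.
resources = {
    "multilingual-v1.0.0.json": {
        "text": {"type": "object", "properties": {"en": {"type": "text"}}}
    }
}
title = {"type": "multilingual-v1.0.0.json#/text", "copy_to": "all_fields"}
expanded = expand(title, resources)
```

Here expanded carries the included "type" and "properties" together with the local "copy_to", while an element such as {"type": "keyword"} passes through unchanged.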

Array of types

Multiple types are supported.

{
  // ...
  "title": {
    "type": [
      "multilingual-v1.0.0.json#/text",
      "copy-v1.0.0.json"
    ]
  }
}

Where copy-v1.0.0.json might contain:

{
  "copy_to": "all_fields"
}

On conflict, an algorithm similar to Python inheritance is used.
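As a rough illustration of what an inheritance-like merge could look like, the sketch below combines already resolved definitions so that entries listed earlier take precedence, the way the first base class does in Python. The precedence order and the helper names deep_merge and merge_types are assumptions made for this example; the library's actual conflict resolution may differ.

```python
import copy


def deep_merge(base, override):
    """Merge override into a copy of base; override wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def merge_types(definitions):
    """Combine resolved definitions; earlier entries win (assumed order)."""
    result = {}
    # Apply the last definition first so that earlier ones overwrite it.
    for definition in reversed(definitions):
        result = deep_merge(result, definition)
    return result


# Two hypothetical resolved definitions that disagree on "analyzer".
text_definition = {"type": "text", "analyzer": "czech"}
copy_definition = {"copy_to": "all_fields", "analyzer": "standard"}
combined = merge_types([text_definition, copy_definition])
```

With this ordering, combined keeps "analyzer": "czech" from the first definition and still picks up "copy_to" from the second.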

oarepo:extends

oarepo:extends behaves exactly the same way as type but can be used anywhere in the mapping.

Dynamic handlers

Sometimes it is better to create the mapping dynamically. For example, the set of supported languages varies from installation to installation and is specified in invenio.cfg.

In entry points, define oarepo_mapping_handlers. The left-hand side before '=' is what should match the value of type or oarepo:extends. It can be either the full value or the part of the value before #.

The handler's signature is:

def handler(type=None, resource=None, id=None, json_pointer=None, app=None, 
            content=None, root=None, content_pointer=None, **kwargs):
    """
    :param type         the type as literally written in the "type" or "oarepo:extends" property
    :param resource     part of the type before '#'
    :param id           part of the type after '#' if it does not start with '/'  
    :param json_pointer part of the type after '#' if it starts with '/'
    :param app          current flask application. Use app.config to get the current config
    :param content      json element containing the ``type`` 
    :param root         the whole mapping
    :param content_pointer 
                        json pointer of the content element
    :param **kwargs     think of extensibility
    """
    return {...}
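A handler following this signature might look like the sketch below, which builds the multilingual mapping from the configured languages. The config key SUPPORTED_LANGUAGES, the handler name multilingual_handler and the SimpleNamespace stand-in for the Flask application are assumptions for illustration only; they are not defined by this library.

```python
from types import SimpleNamespace


def multilingual_handler(type=None, resource=None, id=None, json_pointer=None,
                         app=None, content=None, root=None,
                         content_pointer=None, **kwargs):
    """Generate one text sub-field per configured language."""
    # SUPPORTED_LANGUAGES is a hypothetical config key used for this example.
    languages = app.config.get("SUPPORTED_LANGUAGES", ["en"])
    return {
        "type": "object",
        "properties": {
            lang: {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 100}
                },
            }
            for lang in languages
        },
    }


# Stand-in for the Flask application; only the config attribute is used here.
fake_app = SimpleNamespace(config={"SUPPORTED_LANGUAGES": ["en", "cs"]})
mapping = multilingual_handler(type="multilingual", resource="multilingual",
                               app=fake_app)
```

Such a handler would be registered in the oarepo_mapping_handlers entry point group, for example as "multilingual = my_repo.mapping_handlers:multilingual_handler", so that "type": "multilingual" in a mapping triggers it.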

Merging and replacing content

The handler can return either a dictionary or an instance of oarepo_mapping_includes.Mapping.

If it returns a dictionary, it is merged with the original mapping content (such as extra properties).

If it returns a Mapping(mapping=<dict>, merge=True), the merge parameter defines whether the original mapping content is merged in (True) or completely replaced (False).

This is useful when the handler needs to transform the original content rather than simply merge with it.
