A utility library that generates the data model files required by OARepo from a JSON specification file.

OARepo model builder

A library and command-line tool that generates an Invenio model from a single model file.

CLI Usage

oarepo-compile-model model.yaml

This compiles model.yaml into the current directory. Options:

  --output-directory <dir> Output directory where the generated files will be
                           placed. Defaults to "."
  --package <name>         Package into which the model is generated. If not
                           passed, the name of the current directory,
                           converted into python package name, is used.
  --set <name=value>       Overwrite an option in the model file.
                           Example: --set settings.elasticsearch.keyword-ignore-above=20
  -v                       Increase the verbosity. This option can be used
                           multiple times.
  --config <filename>      Load a config file and replace parts of the model
                           with it. The config file can be a JSON, YAML or
                           Python file. If it is a Python file, it is
                           evaluated with the current model stored in the
                           "oarepo_model" global variable and after the
                           evaluation all globals are set on the model.
  --isort / --skip-isort   Call isort on generated sources (default: yes)
  --black / --skip-black   Call black on generated sources (default: yes)
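
To illustrate what a --set override amounts to, here is a minimal sketch that splits name=value, walks the dotted path and overwrites the leaf. The function name apply_set_option and the rule "try to parse the value as JSON, otherwise keep it as a string" are assumptions made for this example; they are not taken from the tool's source.

```python
import json


def apply_set_option(model: dict, option: str) -> dict:
    """Apply one 'dotted.path=value' override to a nested dict, in place."""
    path, _, raw_value = option.partition("=")
    try:
        # Assumption: values that parse as JSON (numbers, booleans) are converted.
        value = json.loads(raw_value)
    except json.JSONDecodeError:
        value = raw_value  # anything else stays a plain string

    *parents, leaf = path.split(".")
    node = model
    for key in parents:
        # Create intermediate sections that do not exist yet.
        node = node.setdefault(key, {})
    node[leaf] = value
    return model


model = {"settings": {"elasticsearch": {"keyword-ignore-above": 50}}}
apply_set_option(model, "settings.elasticsearch.keyword-ignore-above=20")
print(model["settings"]["elasticsearch"]["keyword-ignore-above"])
```

The sketch does not handle sections that are present but empty (for example a bare `python:` key loaded as None); a real implementation would need to decide what to do there.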

Model file structure

A model is a JSON/YAML file with the following structure:

settings:
  python:
  elasticsearch:
model:
  properties:
    title: { type: 'fulltext' }

There might be more sections (documentation etc.), but only the settings and model are currently processed.

settings section

The settings section might contain the following keys (default values below):

settings:
  package: basename(output dir) with '-' converted to '_'
  kebap-package: to_kebap(package)
  package-path: path to package as python Path instance
  schema-version: 1.0.0
  schema-name: { kebap-package }-{schema-version}.json
  schema-file: full path to generated json schema
  mapping-file: full path to generated mapping
  collection-url: camel_case(last component of package)

  processing-order: [ 'settings', '*', 'model' ]

  python:
    record-prefix: camel_case(last component of package)
    templates: { }   # overridden templates
    marshmallow:
      top-level-metadata: true
      mapping: { }

    record-prefix-snake: snake_case(record_prefix)

    record-class: { settings.package }.record.{record_prefix}Record
      # full record class name with package
    record-schema-class: { settings.package }.schema.{record_prefix}Schema
      # full record schema class name (apart from invenio stuff, contains only metadata field)
    record-schema-metadata-class: { settings.package }.schema.{record_prefix}MetadataSchema
      # full record schema metadata class name (contains model schema as marshmallow)
    record-schema-metadata-alembic: { settings.package_base }
      # name of key in pyproject.toml invenio_db.alembic entry point
    record-metadata-class: { settings.package }.metadata.{record_prefix}Metadata
      # db class to store record's metadata 
    record-metadata-table-name: { record_prefix.lower() }_metadata
      # name of database table for storing metadata 
    record-permissions-class: { settings.package }.permissions.{record_prefix}PermissionPolicy
      # class containing permissions for the record
    record-dumper-class: { settings.package }.dumper.{record_prefix}Dumper
      # record dumper class for elasticsearch
    record-search-options-class: { settings.package }.search_options.{record_prefix}SearchOptions
      # search options for the record
    record-service-config-class: { settings.package }.service_config.{record_prefix}ServiceConfig
      # configuration of record's service
    record-resource-config-class: { settings.package }.resource.{record_prefix}ResourceConfig
      # configuration of record's resource
    record-resource-class: { settings.package }.resource.{record_prefix}Resource
      # record resource
    record-resource-blueprint-name: { record_prefix }
      # blueprint name of the resource
    register-blueprint-function: { settings.package }.blueprint.register_blueprint
      # name of the blueprint registration function

  elasticsearch:
    keyword-ignore-above: 50

  plugins:
    packages: [ ]
    # list of extra packages that should be installed in compiler's venv
    output|builder|model|property:
      # plugin types - file outputs, builders, model preprocessors, property preprocessors 
      disabled: [ ]
      # list of plugin names to disable
      # string "__all__" to disable all plugins in this category    
      enabled:
      # list of plugin names to enable. The plugins will be used
      # in the order defined. Use with disabled: __all__
      # list of "module:className" that will be added at the end of
      # plugin list
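
Several of the defaults listed above are derived from the output directory name. The following sketch shows how a few of them could be computed. The helper names camel_case and derive_defaults, the exact camel-casing rule and the flat result dictionary (the python section is flattened for brevity) are assumptions for illustration, not the library's code.

```python
import re
from pathlib import Path


def camel_case(name: str) -> str:
    # Assumption: "my_model" becomes "MyModel".
    return "".join(part.capitalize() for part in re.split(r"[_\-]+", name) if part)


def derive_defaults(output_dir: str, schema_version: str = "1.0.0") -> dict:
    # package: basename(output dir) with '-' converted to '_'
    package = Path(output_dir).resolve().name.replace("-", "_")
    kebap_package = package.replace("_", "-")
    # record-prefix: camel_case(last component of package)
    record_prefix = camel_case(package.split(".")[-1])
    return {
        "package": package,
        "kebap-package": kebap_package,
        "schema-version": schema_version,
        "schema-name": f"{kebap_package}-{schema_version}.json",
        "record-prefix": record_prefix,
        "record-class": f"{package}.record.{record_prefix}Record",
        "record-metadata-table-name": f"{record_prefix.lower()}_metadata",
    }


defaults = derive_defaults("/tmp/my-model")
for key, value in defaults.items():
    print(f"{key}: {value}")
```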

model section

The model section is a JSON schema that may be annotated with extra sections. For example:

model:
  properties:
    title:
      type: multilingual
      oarepo:ui:
        label: Title
        class: bold-text
      oarepo:documentation: |
        Lorem ipsum ...
        Dolor sit ...

Note: multilingual is a special type (not defined in this library) that is translated to the correct schema, mapping and marshmallow files with a custom PropertyPreprocessor.

oarepo:ui provides information for the UI output.

oarepo:documentation is a section that is currently ignored.
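
One way to picture an annotated model is as a plain schema plus a parallel tree of oarepo: sections. The sketch below separates the two. The function split_annotations is hypothetical and only illustrates the idea; it does not describe how the builder itself processes these sections.

```python
def split_annotations(node):
    """Return (plain_schema, annotations) for a nested model node."""
    if not isinstance(node, dict):
        return node, {}
    plain, annotations = {}, {}
    for key, value in node.items():
        if key.startswith("oarepo:"):
            # Extra section: keep it out of the plain schema.
            annotations[key] = value
        else:
            child_plain, child_annotations = split_annotations(value)
            plain[key] = child_plain
            if child_annotations:
                annotations[key] = child_annotations
    return plain, annotations


model = {
    "properties": {
        "title": {
            "type": "multilingual",
            "oarepo:ui": {"label": "Title", "class": "bold-text"},
            "oarepo:documentation": "Lorem ipsum ...\nDolor sit ...\n",
        }
    }
}
plain, annotations = split_annotations(model)
print(plain)
print(annotations)
```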

Referencing a model

API Usage

To generate an Invenio model from a model file, perform the following steps:

Load the model into a ModelSchema instance

from oarepo_model_builder.schema import ModelSchema
from oarepo_model_builder.loaders import yaml_loader

included_models = {
    'my_model': lambda parent_model: {'test': 'abc'} 
}
loaders = {'yaml': yaml_loader}

model = ModelSchema(file_path='test.yaml', 
                    included_models=included_models, 
                    loaders=loaders)

You can also pass the content of the file directly in the content attribute instead of a file path.

included_models is a mapping between a model key and its accessor. It is used to replace any oarepo:use element. See the Referencing a model section above.

The loaders handle loading of files: the key is the lowercased file extension and the value is a function that takes (schema, path) and returns the loaded content.
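
As a sketch of the loader contract, the example below defines a loader with the (schema, path) signature and picks it by lowercased file extension. The names json_file_loader and load_with are made up for this example, and the dispatch function is only an assumption about how such a mapping might be consulted.

```python
import json
import tempfile
from pathlib import Path


def json_file_loader(schema, path):
    """Loader with the (schema, path) signature; schema is unused in this sketch."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


loaders = {"json": json_file_loader}


def load_with(loaders, schema, path):
    # The key is the file extension, lowercased and without the leading dot.
    extension = Path(path).suffix.lstrip(".").lower()
    try:
        loader = loaders[extension]
    except KeyError:
        raise ValueError(f"No loader registered for extension {extension!r}") from None
    return loader(schema, path)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.JSON"  # upper-case extension on purpose
    model_path.write_text('{"model": {"properties": {}}}', encoding="utf-8")
    loaded = load_with(loaders, None, model_path)

print(loaded)
```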

Create an instance of ModelBuilder

To use the pre-installed set of builders and preprocessors, invoke:

from oarepo_model_builder.entrypoints import create_builder_from_entrypoints

builder = create_builder_from_entrypoints()

To have complete control over the builders and preprocessors, invoke:

   from oarepo_model_builder.builder import ModelBuilder
   from oarepo_model_builder.builders.jsonschema import JSONSchemaBuilder
   from oarepo_model_builder.builders.mapping import MappingBuilder
   from oarepo_model_builder.outputs.jsonschema import JSONSchemaOutput
   from oarepo_model_builder.outputs.mapping import MappingOutput
   from oarepo_model_builder.outputs.python import PythonOutput
   from oarepo_model_builder.property_preprocessors.text_keyword import TextKeywordPreprocessor
   from oarepo_model_builder.model_preprocessors.default_values import DefaultValuesModelPreprocessor
   from oarepo_model_builder.model_preprocessors.elasticsearch import ElasticsearchModelPreprocessor

   builder = ModelBuilder(
     output_builders=[JSONSchemaBuilder, MappingBuilder],
     outputs=[JSONSchemaOutput, MappingOutput, PythonOutput],
     model_preprocessors=[DefaultValuesModelPreprocessor, ElasticsearchModelPreprocessor],
     property_preprocessors=[TextKeywordPreprocessor]
   )

Invoke the builder with the ModelSchema instance and the output directory

   builder.build(schema, output_directory)

Extending the builder

Builder pipeline

First, an instance of ModelSchema is obtained. The schema can be passed either as text (the content of the schema) or as a path pointing to the file. The extension of the file determines which loader is used. JSON, JSON5 and YAML are supported out of the box (if you have the json5 and pyyaml packages installed).

Then ModelBuilder.build(schema, output_dir) is called.

It begins by calling all ModelPreprocessors. They get the whole schema and settings and can modify both. See ElasticsearchModelPreprocessor as an example. The deepmerge function does not overwrite values if they already exist in settings.
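
The "do not overwrite existing values" behaviour can be sketched as a recursive merge in which the target always wins. The function merge_defaults below is a stand-in written for this example, not the library's deepmerge, and the key another-option is invented.

```python
import copy


def merge_defaults(target: dict, defaults: dict) -> dict:
    """Fill in missing keys from defaults; values already in target are kept."""
    for key, value in defaults.items():
        if key not in target:
            # Copy so that later changes to target do not leak into defaults.
            target[key] = copy.deepcopy(value)
        elif isinstance(target[key], dict) and isinstance(value, dict):
            merge_defaults(target[key], value)
        # Otherwise the existing value wins and the default is ignored.
    return target


settings = {"elasticsearch": {"keyword-ignore-above": 20}}
merge_defaults(
    settings,
    {"elasticsearch": {"keyword-ignore-above": 50, "another-option": True}, "python": {}},
)
print(settings)
```

With this shape, a model preprocessor can supply defaults without clobbering anything the user set in the model file or via --set.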

For each of the outputs (jsonschema, mapping, record, resource, ...) the top-level properties of the transformed schema are then iterated. The order of the top-level properties is given by settings.processing-order.
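
A possible reading of settings.processing-order is that named keys keep their listed position and '*' stands for every other top-level key. The sketch below implements that reading; both the interpretation of '*' and the function order_top_level are assumptions.

```python
def order_top_level(keys, processing_order):
    """Order top-level keys; '*' expands to all keys not named explicitly."""
    named = [name for name in processing_order if name != "*"]
    rest = [key for key in keys if key not in named]
    ordered = []
    for name in processing_order:
        if name == "*":
            ordered.extend(rest)
        elif name in keys:
            ordered.append(name)
    return ordered


print(order_top_level(["model", "documentation", "settings"], ["settings", "*", "model"]))
```

Note that in this sketch keys that are not named are dropped when the order contains no '*'.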

For the top-level property and each of its descendants (a visitor pattern, visiting property by property), a PropertyPreprocessor is called.

The preprocessor can modify the property, remove it, or replace it with a new set of properties (see multilang in tests).
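
The three possible outcomes (modify, remove, replace) can be sketched with a small visitor. Everything here is a toy: the visit function, the convention that a preprocessor returns a dict of replacement properties, the "hidden" type and the language suffixes are invented for illustration and do not mirror the PropertyPreprocessor API.

```python
def visit(properties, preprocess):
    """Depth-first walk; preprocess(name, prop) returns a dict of replacement properties."""
    result = {}
    for name, prop in properties.items():
        for new_name, new_prop in preprocess(name, prop).items():
            if "properties" in new_prop:
                # Descend into nested properties after the parent was processed.
                new_prop = {**new_prop, "properties": visit(new_prop["properties"], preprocess)}
            result[new_name] = new_prop
    return result


def toy_preprocessor(name, prop):
    if prop.get("type") == "fulltext":
        return {name: {**prop, "type": "string"}}  # modify
    if prop.get("type") == "hidden":
        return {}  # remove
    if prop.get("type") == "multilingual":
        # replace one property with a new set of properties
        return {f"{name}_{lang}": {"type": "string"} for lang in ("en", "cs")}
    return {name: prop}  # leave untouched


model_properties = {
    "title": {"type": "multilingual"},
    "note": {"type": "hidden"},
    "meta": {"type": "object", "properties": {"abstract": {"type": "fulltext"}}},
}
processed = visit(model_properties, toy_preprocessor)
print(processed)
```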

The property is then passed to the OutputBuilder (an example is JSONSchemaBuilder) that serializes the tree of properties into the output.

The output builder does not create files on the filesystem explicitly but uses instances of OutputBase, for example JSONOutput or more specialized JSONSchemaOutput.

See JSONBaseBuilder for an example of how to get an output and write to it (in this case, the json-based output).

This way, even if more output builders access the same file, their access is coordinated.
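
The coordination idea can be sketched as a registry that hands out one output object per path, so that two builders asking for the same file share it. The classes ToyJSONOutput and ToyOutputRegistry are hypothetical and much simpler than OutputBase and its subclasses; nothing is written to disk here.

```python
import json


class ToyJSONOutput:
    """In-memory stand-in for a JSON-based output."""

    def __init__(self, path):
        self.path = path
        self.data = {}

    def merge(self, fragment):
        self.data.update(fragment)

    def render(self):
        return json.dumps(self.data, indent=2, sort_keys=True)


class ToyOutputRegistry:
    """Hands out exactly one output object per path."""

    def __init__(self):
        self._outputs = {}

    def get_output(self, path):
        if path not in self._outputs:
            self._outputs[path] = ToyJSONOutput(path)
        return self._outputs[path]

    def render_all(self):
        return {path: output.render() for path, output in self._outputs.items()}


registry = ToyOutputRegistry()
# Two independent builders contribute to the same file.
registry.get_output("schema.json").merge({"type": "object"})
registry.get_output("schema.json").merge({"properties": {"title": {"type": "string"}}})
rendered = registry.render_all()
print(rendered["schema.json"])
```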

Registering Preprocessors, Builders and Outputs for commandline client

The model and property preprocessors, output builders and outputs are registered in entry points. With poetry, the registration looks like this:

[tool.poetry.plugins."oarepo_model_builder.builders"]
010-jsonschema = "oarepo_model_builder.builders.jsonschema:JSONSchemaBuilder"
020-mapping = "oarepo_model_builder.builders.mapping:MappingBuilder"
030-python_structure = "oarepo_model_builder.builders.python_structure:PythonStructureBuilder"
040-invenio_record = "oarepo_model_builder.invenio.invenio_record:InvenioRecordBuilder"

[tool.poetry.plugins."oarepo_model_builder.outputs"]
jsonschema = "oarepo_model_builder.outputs.jsonschema:JSONSchemaOutput"
mapping = "oarepo_model_builder.outputs.mapping:MappingOutput"
python = "oarepo_model_builder.outputs.python:PythonOutput"

[tool.poetry.plugins."oarepo_model_builder.property_preprocessors"]
010-text_keyword = "oarepo_model_builder.preprocessors.text_keyword:TextKeywordPreprocessor"

[tool.poetry.plugins."oarepo_model_builder.model_preprocessors"]
01-default = "oarepo_model_builder.transformers.default_values:DefaultValuesModelPreprocessor"
10-invenio = "oarepo_model_builder.transformers.invenio:InvenioModelPreprocessor"
20-elasticsearch = "oarepo_model_builder.transformers.elasticsearch:ElasticsearchModelPreprocessor"

[tool.poetry.plugins."oarepo_model_builder.loaders"]
json = "oarepo_model_builder.loaders:json_loader"
json5 = "oarepo_model_builder.loaders:json_loader"
yaml = "oarepo_model_builder.loaders:yaml_loader"
yml = "oarepo_model_builder.loaders:yaml_loader"

[tool.poetry.plugins."oarepo_model_builder.templates"]
99-base_templates = "oarepo_model_builder.invenio.templates"
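
To inspect what is registered in a group, the standard library's importlib.metadata can list the entry points. The sketch below returns them sorted by name, on the assumption that the numeric prefixes are meant to define an order; the helper sorted_entry_points is not part of the library. On an environment without oarepo-model-builder installed it simply prints nothing.

```python
from importlib.metadata import entry_points


def sorted_entry_points(group: str):
    """Return (name, value) pairs of a group's entry points, sorted by name."""
    all_eps = entry_points()
    if hasattr(all_eps, "select"):
        # Python 3.10+: selectable collection
        selected = all_eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict of group name -> entry points
        selected = all_eps.get(group, [])
    return [(ep.name, ep.value) for ep in sorted(selected, key=lambda ep: ep.name)]


for name, target in sorted_entry_points("oarepo_model_builder.builders"):
    print(name, "->", target)
```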

Generating python files

The default Python output is based on LibCST, which enables merging generated code with code that is already present in the output files. The transformer provided in this package can:

Add imports
Add a new class or function on top-level
Add a new method to an existing class
Add a new const/property to an existing class

The transformer will not touch an existing function or method. Increase the verbosity level to get a list of rejected patches, or add --set settings.python.overwrite=true (use with caution: keep the sources in git and review the diff afterwards).
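
The "add what is missing, never touch what exists" rule can be sketched with the standard library's ast module. This is a deliberately simplified substitute for the LibCST-based transformer: it only looks at top-level functions and classes, ignores decorators and imports, and the function merge_new_definitions is invented for this example.

```python
import ast

_DEFS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def merge_new_definitions(existing: str, generated: str):
    """Append generated top-level definitions that do not exist yet.

    Returns the merged source and the names of rejected (already present) definitions.
    """
    existing_names = {
        node.name for node in ast.parse(existing).body if isinstance(node, _DEFS)
    }
    pieces = [existing.rstrip("\n")]
    rejected = []
    for node in ast.parse(generated).body:
        if not isinstance(node, _DEFS):
            continue
        if node.name in existing_names:
            rejected.append(node.name)  # hand-written code is left alone
        else:
            pieces.append(ast.get_source_segment(generated, node))
    return "\n\n\n".join(pieces) + "\n", rejected


existing = "def title(self):\n    return 'hand-written'\n"
generated = (
    "def title(self):\n    return 'generated'\n\n\n"
    "def created(self):\n    return 'generated'\n"
)
merged, rejected = merge_new_definitions(existing, generated)
print(merged)
print("rejected:", rejected)
```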

Overriding default templates

The default templates are written as jinja2-based templates.

To override one or more templates, create a package containing the templates and register it in oarepo_model_builder.templates. Make sure the registration key sorts before 99-. The template loader iterates over the sorted set of keys, so your templates are loaded before the default ones. Example:

my_package
   +-- __init__.py
   +-- templates
       +-- invenio_record.py.jinja2

# my_package/__init__.py
TEMPLATES = {
 # resolved relative to the package
 "record": "templates/invenio_record.py.jinja2"
}

[tool.poetry.plugins."oarepo_model_builder.templates"]
20-my_templates = "my_package"
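
The effect of the registration key can be sketched as "iterate the keys in sorted order and let the first registration of a template name win". That first-wins rule is an assumption drawn from the description above, and resolve_templates, the schema template and the base/ paths are made up for the example.

```python
def resolve_templates(registrations):
    """registrations: {entry point key: {template name: path}}."""
    resolved = {}
    for key in sorted(registrations):
        for name, path in registrations[key].items():
            # The first registration seen for a template name wins.
            resolved.setdefault(name, (key, path))
    return resolved


registrations = {
    "99-base_templates": {
        "record": "base/record.py.jinja2",
        "schema": "base/schema.py.jinja2",
    },
    "20-my_templates": {"record": "templates/invenio_record.py.jinja2"},
}
resolved = resolve_templates(registrations)
for name, (key, path) in resolved.items():
    print(f"{name}: {path} (from {key})")
```

The keys are compared as strings, so "20-" sorts before "99-" and the custom record template is picked while the schema template still comes from the defaults.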

To override a template for a single model, specify the relative path to the template in your model file (or in a configuration file passed with the -c option, or via the --set option):

settings:
  python:
    templates:
      record: ./test/my_invenio_record.py.jinja2
